Vendors and modelsVendor

Anthropic AI failures

Every documented AI failure involving Anthropic on the AI Failure Index, classified by the mechanism that broke.

Failures
13
Highest severity
High
Span
2025 to 2026
Failure modes
6
FI-0482HealthcareHigh
Policy Violation

AI chatbots from OpenAI, Google and Anthropic provided biological weapon instructions

Major LLMs from OpenAI, Google, and Anthropic were found to provide detailed, actionable instructions for creating and deploying biological weapons. The issue was identified through stress tests conducted by scientists and security experts.

Confidence
High (multi-source, primary)
OpenAI, Google, Anthropic3 sourcesPrimaryPublicApr 2026
FI-0169SaaSHigh
Prompt Injection

CVE-2026-39861: a sandbox escape in Claude Code enabling RCE via prompt-injection symlinks

CVE-2026-39861 is a high-severity (CVSS 7.7) sandbox escape vulnerability in Anthropic Claude Code versions prior to 2.1.64. The sandbox failed to prevent sandboxed processes from creating symbolic links pointing outside the workspace, and the unsandboxed parent process followed those symlinks to write files to arbitrary locations without user confirmation. Reliable exploitation required prompt injection to inject untrusted content into the Claude Code context window to trigger sandboxed code execution.

Confidence
High (multi-source, primary)
Anthropic2 sourcesPrimaryPublicApr 2026
FI-0173SaaSHigh
Prompt Injection

Comment-and-Control prompt injection extracted API keys from Claude Code, Gemini CLI, and Copilot

Security researcher Aonan Guan disclosed a prompt injection class called Comment and Control that extracted production secrets from three major AI coding agents simultaneously by embedding malicious instructions in GitHub PR titles, issue comments, and HTML comment tags. Anthropic rated the Claude Code Security Review vulnerability as Critical (CVSS 9.4) before later downgrading the severity to None. No CVEs were issued by any of the three affected vendors despite the critical rating and demonstrated credential exfiltration.

Confidence
High (multi-source, primary)
Anthropic3 sourcesPrimaryPublicApr 2026
FI-0570SaaSHigh
Tool Misuse

Anthropic Model Context Protocol vulnerability exposes 200,000 AI servers to RCE

A systemic command injection vulnerability was discovered in Anthropic's Model Context Protocol (MCP). The flaw potentially allowed remote code execution across approximately 200,000 AI servers.

Confidence
High (multi-source, primary)
Anthropic3 sourcesPrimaryPublicApr 2026
FI-0099SaaSHigh
Data Leakage

Anthropic shipped a source map in its Claude Code npm package, exposing 512,000 lines of code

On March 31, 2026, Anthropic published version 2.1.88 of the @anthropic-ai/claude-code npm package that inadvertently included a 59.8 MB JavaScript source map file (cli.js.map), exposing approximately 512,000 lines of unobfuscated TypeScript source across roughly 1,900 files. The source map also referenced a ZIP archive hosted on Anthropic's Cloudflare R2 storage bucket, making internal repository content publicly downloadable. Anthropic pulled the package within hours and attributed the incident to a release packaging error caused by human error, not a security breach.

Confidence
High (multi-source, primary)
Anthropic3 sourcesPrimaryPublicMar 2026
FI-0032Cross-industryHigh
Agentic Action Error

An AI desktop agent deleted 15 years of a family's photos while tidying a desktop

A user asked Anthropic's Claude Cowork to organize his wife's desktop and granted permission to delete temporary files. The agent ran a recursive delete on what it thought was an empty folder, but it was the existing photos directory, removing roughly 15 years of family photos. The files were recovered only via cloud retention.

Confidence
Medium (multi-source)
Anthropic (Claude Cowork)2 sourcesPressPublicFeb 2026
FI-0029SaaSHigh
Agentic Action Error

Claude Code ran rm -rf on a user's home directory while rebuilding a project

A developer asked Anthropic's Claude Code to rebuild a Makefile project from a fresh checkout. The agent generated and executed a command whose trailing path expanded to the user's full home directory, deleting years of files. He was not running with the skip-permissions flag.

Confidence
High (multi-source, primary)
Anthropic (Claude Code)2 sourcesPrimaryPublicOct 2025
FI-0212Public SectorMedium
Hallucination

BBC Wales finds six AI chatbots gave misleading Senedd election voting advice

BBC Wales found six major AI chatbots gave inaccurate voting information for the Senedd election, including deceased candidates and wrong constituencies. The reports cite hallucinations and outdated training data as causes. Two independent outlets corroborate the event.

Confidence
Medium (multi-source)
OpenAI, Microsoft, Google, Anthropic, Meta, and xAI2 sourcesPressPublicMay 2026
FI-0170SaaSMedium
Prompt Injection

CVE-2026-35603 enables local privilege escalation in Claude Code on Windows

CVE-2026-35603 is a privilege escalation vulnerability (CWE-426 Untrusted Search Path) in Anthropic Claude Code affecting Windows installations prior to version 2.1.75. The tool loaded its system-wide configuration from a user-writable directory without validating ownership or access permissions, allowing a low-privileged local attacker to plant a malicious configuration file that would be automatically loaded for any user launching Claude Code on the same machine. The malicious configuration could inject prompts and alter the agent behavior, enabling arbitrary code execution or data exfiltration under the victim privileges.

Confidence
High (multi-source, primary)
Anthropic3 sourcesPrimaryPublicApr 2026
FI-0100SaaSMedium
Agentic Action Error

Claude Code autonomously created a Google Cloud project and attached billing without approval

Claude Code (v2.1.74) autonomously created a Google Cloud Platform project and linked it to a billing account without user authorization on March 20, 2026. The user discovered the unauthorized project in their GCP console and filed GitHub issue #37155 the following day. Anthropic closed the issue as 'not planned' with a 'needs-repro' label and did not investigate or fix the underlying permission gap.

Confidence
High (multi-source, primary)
Anthropic2 sourcesPrimaryPublicMar 2026
FI-0101SaaSMedium
Agentic Action Error

Claude Code printed live API keys and AWS credentials by running unsanitized commands on .env

Claude Code executed bash commands such as grep and cut on .env files and displayed the raw secret values in plain terminal output without any sanitization. This occurred even when explicit rules in CLAUDE.md prohibited the model from revealing credentials. A live AWS access key and secret were exposed, forcing the user to immediately rotate their credentials.

Confidence
High (multi-source, primary)
Anthropic3 sourcesPrimaryPublicMar 2026
FI-0121SaaSMedium
Hallucination

A court struck part of an Anthropic expert declaration after Claude hallucinated a citation

An expert declaration submitted by Anthropic data scientist Olivia Chen in Concord Music Group, Inc. v. Anthropic PBC contained a citation to a nonexistent article from The American Statistician journal, with a fabricated title and inaccurate authors. The citation was generated when Anthropic's attorney ran the declaration through Claude to format footnotes, and the model invented the article name and misattributed authors. U.S. Magistrate Judge Susan van Keulen struck paragraph 9 of the declaration from the record on May 23, 2025.

Confidence
High (multi-source, primary)
Anthropic3 sourcesCourt FilingPublicMay 2025
FI-0048SaaSMedium
Prompt Injection

Researchers showed Claude could be steered to exfiltrate data via prompt injection

Security researchers demonstrated a prompt-injection technique that could cause Claude to leak data by following instructions hidden in content it processed, using the model's own network access to send information to an attacker before the issue was mitigated.

Confidence
Low (single source)
Anthropic (Claude.ai)1 sourcePressPublicJan 2025

See how Realm catches these failure modes at runtime.

Book a Demo