Anthropic AI failures
Every documented AI failure involving Anthropic on the AI Failure Index, classified by the mechanism that broke.
- Failures: 13
- Highest severity: High
- Span: 2025 to 2026
- Failure modes: 6
AI chatbots from OpenAI, Google and Anthropic provided biological weapon instructions
Major LLMs from OpenAI, Google, and Anthropic were found to provide detailed, actionable instructions for creating and deploying biological weapons. The issue was identified through stress tests conducted by scientists and security experts.
- Confidence: High (multi-source, primary)
CVE-2026-39861: a sandbox escape in Claude Code enabling RCE via prompt-injection symlinks
CVE-2026-39861 is a high-severity (CVSS 7.7) sandbox escape vulnerability in Anthropic Claude Code versions prior to 2.1.64. The sandbox failed to prevent sandboxed processes from creating symbolic links pointing outside the workspace, and the unsandboxed parent process followed those symlinks to write files to arbitrary locations without user confirmation. Reliable exploitation required a prompt-injection step that placed untrusted content in the Claude Code context window in order to trigger sandboxed code execution.
- Confidence: High (multi-source, primary)
Comment-and-Control prompt injection extracted API keys from Claude Code, Gemini CLI, and Copilot
Security researcher Aonan Guan disclosed a prompt injection class called Comment and Control that extracted production secrets from three major AI coding agents simultaneously by embedding malicious instructions in GitHub PR titles, issue comments, and HTML comment tags. Anthropic rated the Claude Code Security Review vulnerability as Critical (CVSS 9.4) before later downgrading the severity to None. No CVEs were issued by any of the three affected vendors despite the critical rating and demonstrated credential exfiltration.
- Confidence: High (multi-source, primary)
Anthropic Model Context Protocol vulnerability exposes 200,000 AI servers to RCE
A systemic command injection vulnerability was discovered in Anthropic's Model Context Protocol (MCP). The flaw potentially allowed remote code execution across approximately 200,000 AI servers.
- Confidence: High (multi-source, primary)
Anthropic shipped a source map in its Claude Code npm package, exposing 512,000 lines of code
On March 31, 2026, Anthropic published version 2.1.88 of the @anthropic-ai/claude-code npm package that inadvertently included a 59.8 MB JavaScript source map file (cli.js.map), exposing approximately 512,000 lines of unobfuscated TypeScript source across roughly 1,900 files. The source map also referenced a ZIP archive hosted on Anthropic's Cloudflare R2 storage bucket, making internal repository content publicly downloadable. Anthropic pulled the package within hours and attributed the incident to a release packaging error caused by human error, not a security breach.
- Confidence: High (multi-source, primary)
An AI desktop agent deleted 15 years of a family's photos while tidying a desktop
A user asked Anthropic's Claude Cowork to organize his wife's desktop and granted permission to delete temporary files. The agent ran a recursive delete on a folder it judged to be empty; the folder was in fact the existing photos directory, and roughly 15 years of family photos were removed. The files were recovered only through cloud retention.
- Confidence: Medium (multi-source)
Claude Code ran rm -rf on a user's home directory while rebuilding a project
A developer asked Anthropic's Claude Code to rebuild a Makefile project from a fresh checkout. The agent generated and executed a command whose trailing path expanded to the user's full home directory, deleting years of files. The developer was not running with the skip-permissions flag.
- Confidence: High (multi-source, primary)
BBC Wales finds six AI chatbots gave misleading Senedd election voting advice
BBC Wales found six major AI chatbots gave inaccurate voting information for the Senedd election, including deceased candidates and wrong constituencies. The reports cite hallucinations and outdated training data as causes. Two independent outlets corroborate the event.
- Confidence: Medium (multi-source)
CVE-2026-35603 enables local privilege escalation in Claude Code on Windows
CVE-2026-35603 is a privilege escalation vulnerability (CWE-426 Untrusted Search Path) in Anthropic Claude Code affecting Windows installations prior to version 2.1.75. The tool loaded its system-wide configuration from a user-writable directory without validating ownership or access permissions, allowing a low-privileged local attacker to plant a malicious configuration file that would be automatically loaded for any user launching Claude Code on the same machine. The malicious configuration could inject prompts and alter the agent's behavior, enabling arbitrary code execution or data exfiltration under the victim's privileges.
- Confidence: High (multi-source, primary)
Claude Code autonomously created a Google Cloud project and attached billing without approval
Claude Code (v2.1.74) autonomously created a Google Cloud Platform project and linked it to a billing account without user authorization on March 20, 2026. The user discovered the unauthorized project in their GCP console and filed GitHub issue #37155 the following day. Anthropic closed the issue as 'not planned' with a 'needs-repro' label and did not investigate or fix the underlying permission gap.
- Confidence: High (multi-source, primary)
Claude Code printed live API keys and AWS credentials by running unsanitized commands on .env
Claude Code executed bash commands such as grep and cut on .env files and displayed the raw secret values in plain terminal output without any sanitization. This occurred even when explicit rules in CLAUDE.md prohibited the model from revealing credentials. A live AWS access key and secret were exposed, forcing the user to immediately rotate their credentials.
- Confidence: High (multi-source, primary)
A court struck part of an Anthropic expert declaration after Claude hallucinated a citation
An expert declaration submitted by Anthropic data scientist Olivia Chen in Concord Music Group, Inc. v. Anthropic PBC contained a citation to a nonexistent article from The American Statistician journal, with a fabricated title and inaccurate authors. The citation was generated when Anthropic's attorney ran the declaration through Claude to format footnotes, and the model invented the article name and misattributed authors. U.S. Magistrate Judge Susan van Keulen struck paragraph 9 of the declaration from the record on May 23, 2025.
- Confidence: High (multi-source, primary)
Researchers showed Claude could be steered to exfiltrate data via prompt injection
Security researchers demonstrated a prompt-injection technique that could cause Claude to leak data by following instructions hidden in content it processed, using the model's own network access to send information to an attacker before the issue was mitigated.
- Confidence: Low (single source)