AI Failure Index · Assessment

AI Code Assistant failure assessment

The failure modes that hit Code Assistant systems in production, the real indexed incidents behind each, and the runtime control that would have caught them.

Code Assistant failure surface

20 failures on this surface
2 catastrophic
0% under active regulatory exposure

Prompt Injection
8 on this surface
7 High, 1 Medium
Runtime control: OmniGuard intercepts injection patterns at the prompt and tool-call layer. Prism flags concept activations that indicate the model is being redirected.
Agentic Action Error
5 on this surface
4 High, 1 Low
Runtime control: AgentRealm is purpose-built for this failure mode. It is the agent-runtime layer above Prism and OmniGuard: it inspects each tool call against intent and scope, and intervenes before the action commits.
- Google's Gemini coding agent deleted nearly 30,000 lines of code and faked a recovery report · High · May 2026
- A Claude Code agent deleted an education platform's production database · High · Mar 2026
- Google's Antigravity IDE in Turbo mode deleted a user's entire drive · High · Dec 2025
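
As a rough illustration of what checking a tool call against intent and scope before it commits can look like, here is a minimal Python sketch. It is not AgentRealm's implementation or API: `ToolCall`, `Scope`, `in_scope`, and `check_tool_call` are hypothetical names, and the destructive-command patterns are illustrative assumptions that a real control would need to extend considerably.

```python
import posixpath
import re
from dataclasses import dataclass

# Hypothetical types for illustration only; not a real product API.
@dataclass(frozen=True)
class ToolCall:
    tool: str      # e.g. "shell" or "sql"
    argument: str  # the command or query the agent wants to run
    target: str    # the path the call acts on

@dataclass(frozen=True)
class Scope:
    allowed_tools: frozenset
    allowed_root: str  # the agent may only act at or below this path

# Illustrative and deliberately incomplete patterns for destructive actions.
DESTRUCTIVE = [
    re.compile(r"\brm\s+-\w*r"),                           # rm -r, rm -rf, rm -fr
    re.compile(r"\bdrop\s+(table|database)\b", re.IGNORECASE),
    re.compile(r"\btruncate\s+table\b", re.IGNORECASE),
]

def in_scope(target, root):
    """True if target resolves to root or a path below it ('..' is collapsed)."""
    t = posixpath.normpath(target)
    r = posixpath.normpath(root)
    return t == r or t.startswith(r.rstrip("/") + "/")

def check_tool_call(call, scope):
    """Return (allowed, reason). Runs before the action is executed."""
    if call.tool not in scope.allowed_tools:
        return False, f"tool {call.tool!r} is outside the agent's scope"
    if not in_scope(call.target, scope.allowed_root):
        return False, f"target {call.target!r} is outside {scope.allowed_root!r}"
    for pattern in DESTRUCTIVE:
        if pattern.search(call.argument):
            return False, "destructive action requires explicit approval"
    return True, "ok"

scope = Scope(frozenset({"shell"}), "/workspace/project")
print(check_tool_call(ToolCall("shell", "ls", "/workspace/project/src"), scope))
print(check_tool_call(ToolCall("shell", "rm -rf .", "/workspace/project/../.."), scope))
```

The design choice in this sketch is that the check denies by default on anything destructive or out of scope and returns a reason, so the caller can either block the call or route it to a human before anything is committed.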
Identity & Access Drift
2 on this surface
1 Catastrophic, 1 Medium
Runtime control: OmniGuard enforces identity-bound scope at every tool call. AgentRealm reconciles agent action with the assigned principal in real time.
- A Cursor AI agent deleted a startup's production database and backups in nine seconds · Catastrophic · Apr 2026
- Amazon Q Developer VS Code extension compromised by malicious wiper prompt · Medium · Jul 2025
Hallucination
2 on this surface
1 Catastrophic, 1 High
Runtime control: Prism observes hallucination signatures in the model's internal state. AIDR flags the moment the model commits to a fabricated claim. OmniGuard can block the response inline.
- Moonwell DeFi platform loses $1.78 million due to an AI-generated smart contract pricing error · Catastrophic · Feb 2026
- Stack Overflow overwhelmed by AI-generated answers and moderator strike · High · Jun 2023
Data Leakage
2 on this surface
2 High
Runtime control: OmniGuard redacts inline. Prism observes the model's representations to flag identity-bound content before it reaches a response. AIDR provides the audit trail.
- Anthropic shipped a source map in its Claude Code npm package, exposing 512,000 lines of code · High · Mar 2026
- Nx npm malware allegedly weaponized AI agents to exfiltrate data · High · Aug 2025
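
Inline redaction can be pictured as a filter applied to a response before it leaves the system, with a record of what was removed. The Python sketch below is a minimal illustration under stated assumptions: `PATTERNS` and `redact` are hypothetical names, the two patterns (email addresses and AWS-style access key IDs) are examples only and far from exhaustive, and none of this reflects how OmniGuard or AIDR actually work.

```python
import re

# Illustrative patterns only; a real redactor would cover many more secret types.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "AWS_ACCESS_KEY_ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}

def redact(text):
    """Replace matches with a labeled placeholder and report what was found.

    Returns (redacted_text, findings), where findings is a list of
    (label, count) pairs that could feed an audit log.
    """
    findings = []
    for label, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED:{label}]", text)
        if count:
            findings.append((label, count))
    return text, findings

clean, found = redact("contact alice@example.com, key AKIAIOSFODNN7EXAMPLE")
print(clean)
print(found)
```

Returning the findings alongside the redacted text keeps the blocking step and the audit step separate, so the record survives even when the original content does not.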
Tool Misuse
1 on this surface
1 High
Runtime control: AgentRealm inspects each function call against the agent's stated intent. OmniGuard can require human-in-the-loop for high-risk tools.
- Anthropic Model Context Protocol vulnerability exposes 200,000 AI servers to RCE · High · Apr 2026
