AI Failure Index · Assessment

AI Agentic Workflow failure assessment

The failure modes that hit Agentic Workflow systems in production, the real indexed incidents behind each, and the runtime control that would have caught them.

Agentic Workflow failure surface

90failures on this surface
10catastrophic
40%under active regulatory exposure

Policy Violation
28 on this surface
4 Catastrophic 18 High 6 Medium
Runtime control OmniGuard authors policy at the runtime layer and enforces it inline. Prism reads the model's intent against the policy boundary.
- UnitedHealth's nH Predict algorithm allegedly drove wrongful denials of elderly careCatastrophic Nov 2023
- iTutor Group AI hiring tool rejected older applicants by designCatastrophic Sep 2023
- Cigna's PxDx system let doctors reject 300,000 claims in two months without reading themCatastrophic Mar 2023
Agentic Action Error
23 on this surface
1 Catastrophic 9 High 13 Medium
Runtime control AgentRealm is purpose-built for this. The agent-runtime layer above Prism and OmniGuard inspects each tool call against intent and scope, and intervenes before the action commits.
- Zillow's home-buying algorithm overpaid so badly it shut the business and cut a quarter of staffCatastrophic Nov 2021
- A Meta internal AI agent's faulty instructions exposed sensitive data to staff for two hoursHigh Mar 2026
- Lobstar Wilde AI agent accidentally transfers $441,000 in crypto tokensHigh Feb 2026
Hallucination
12 on this surface
3 High 6 Medium 3 Low
Runtime control Prism observes hallucination signatures in the model's internal state. AIDR flags the moment the model commits to a fabricated claim. OmniGuard can block the response inline.
- Upstart Model 22 miscalibration and CFPB terminates no-action letterHigh Apr 2026
- IRCC automation produced incorrect assessments and at least one AI-generated refusalHigh Mar 2026
- Pieces Technologies settles Texas AG allegations over AI hallucination claimsHigh Sep 2024
Prompt Injection
9 on this surface
3 Catastrophic 6 High
Runtime control OmniGuard intercepts injection patterns at the prompt and tool-call layer. Prism flags concept activations that indicate the model is being redirected.
- OpenClaw agent skills suffer widespread vulnerabilities and data exfiltrationCatastrophic Jan 2026
- ForcedLeak prompt injection let attackers exfiltrate CRM data from Salesforce AgentforceCatastrophic Sep 2025
- Notion AI exposed to indirect prompt injection via PDF processingCatastrophic Sep 2025
Brand & Safety Incident
7 on this surface
5 High 2 Medium
Runtime control Prism reads the model's representation against brand and safety policy. OmniGuard blocks inline. AIDR provides the post-incident audit trail.
- LlamaIndex Denial-of-Service Vulnerability (CVE-2024-12704)High Mar 2025
- Haystack AI framework vulnerability allows remote code execution via template injectionHigh Jul 2024
- Thomson Reuters fraud detection software subject of FTC complaintHigh Jan 2024
Tool Misuse
5 on this surface
1 Catastrophic 4 High
Runtime control AgentRealm inspects each function call against the agent's stated intent. OmniGuard can require human-in-the-loop for high-risk tools.
- OpenClaw ClawHub marketplace exploited to distribute macOS stealer malwareCatastrophic Feb 2026
- U.S. immigration AI screening triggers spike in visa denials and RFEsHigh Apr 2026
- CrewAI Docker status check failure enables remote code executionHigh Mar 2026
Identity & Access Drift
3 on this surface
1 Catastrophic 2 High
Runtime control OmniGuard enforces identity-bound scope at every tool call. AgentRealm reconciles agent action with the assigned principal in real time.
- ServiceNow AI platform flaw allowed unauthenticated user impersonationCatastrophic Oct 2025
- Amazon's Kiro coding agent deleted a production environment, causing a 13-hour AWS outageHigh Dec 2025
- IRS audit selection algorithms disproportionately target Black taxpayersHigh Jan 2023
Data Leakage
3 on this surface
3 High
Runtime control OmniGuard redacts inline. Prism observes the model's representations to flag identity-bound content before it reaches a response. AIDR provides the audit trail.
- Clawdbot/Moltbot exposed admin dashboards enabled unauthenticated RCE and data leaksHigh Jan 2026
- xAI developer leaks API key for private SpaceX and Tesla LLMsHigh Mar 2025
- Serbia Social Card registry automation causes benefit losses for marginalized groupsHigh Mar 2022

Where this surface bites hardest

See how Realm catches these failure modes at runtime, before they reach a user.

Book a Demo

AI Agentic Workflow failure assessment

Agentic Workflow failure surface

Email me this assessment