AI Failure Index · Assessment

AI Voice Agent failure assessment

The failure modes that hit Voice Agent systems in production, the real indexed incidents behind each, and the runtime control that would have caught them.

Voice Agent failure surface

43 failures on this surface
0 catastrophic
28% under active regulatory exposure

Brand & Safety Incident
12 on this surface
8 High 3 Medium 1 Low
Runtime control: Prism reads the model's representation against brand and safety policy. OmniGuard blocks inline. AIDR provides the post-incident audit trail.
Tool Misuse
11 on this surface
7 High 4 Medium
Runtime control: AgentRealm inspects each function call against the agent's stated intent. OmniGuard can require human-in-the-loop for high-risk tools.
- Lithuanian politicians and doctors impersonated in deepfake health scam (High, Jun 2025)
- Deepfake of Dr Rinki Murphy and Jack Tame promotes fake diabetes cure in New Zealand (High, Apr 2025)
- Michigan woman loses $26,000 to AI deepfake romance scam (High, Feb 2025)
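The idea of checking each function call against stated intent, with human approval for high-risk tools, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tool names, the intent-to-tool mapping, and the `gate` function are hypothetical and do not represent AgentRealm's or OmniGuard's actual interfaces.

```python
from dataclasses import dataclass

# Hypothetical set of tools that always need human sign-off.
HIGH_RISK_TOOLS = {"transfer_funds", "change_payout_account"}

# Hypothetical mapping from a stated intent to the tools it may use.
ALLOWED_TOOLS_BY_INTENT = {
    "check_balance": {"get_balance"},
    "pay_bill": {"get_balance", "transfer_funds"},
}


@dataclass(frozen=True)
class ToolCall:
    name: str             # tool the agent wants to invoke
    declared_intent: str  # what the agent said it is trying to do


def gate(call: ToolCall) -> str:
    """Return 'block', 'needs_human', or 'allow' for a proposed tool call."""
    allowed = ALLOWED_TOOLS_BY_INTENT.get(call.declared_intent, set())
    if call.name not in allowed:
        # The tool does not fit the stated intent (or the intent is unknown).
        return "block"
    if call.name in HIGH_RISK_TOOLS:
        # In scope, but risky enough to pause for a human decision.
        return "needs_human"
    return "allow"
```

In this sketch the decision is made before the tool runs, so a mismatched or high-risk call never commits on its own. A production control would need a richer notion of intent than a string lookup.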
Hallucination
5 on this surface
1 High 2 Medium 2 Low
Runtime control: Prism observes hallucination signatures in the model's internal state. AIDR flags the moment the model commits to a fabricated claim. OmniGuard can block the response inline.
- OpenAI Whisper hallucinations in medical settings prompt safety concerns, AP reports (High, Oct 2024)
- McDonald's ends IBM AI drive-thru order-taking pilot (Medium, Jun 2024)
- Wendy's FreshAI drive-thru agent misheard orders and cut customers off mid-sentence (Medium, Apr 2024)
Agentic Action Error
5 on this surface
5 Medium
Runtime control: AgentRealm is purpose-built for this. The agent-runtime layer above Prism and OmniGuard inspects each tool call against intent and scope, and intervenes before the action commits.
Policy Violation
5 on this surface
1 High 3 Medium 1 Low
Runtime control: OmniGuard authors policy at the runtime layer and enforces it inline. Prism reads the model's intent against the policy boundary.
- Public-sector voice agent failed Spanish-accented English callers at 4x the rate of native speakers (High, Nov 2025)
- Domino's class-action alleges AI voice-order system captured customers' voiceprints (Medium, Mar 2024)
- AI song mimicking Drake and The Weeknd removed from streaming services (Medium, Apr 2023)
Identity & Access Drift
3 on this surface
2 High 1 Medium
Runtime control: OmniGuard enforces identity-bound scope at every tool call. AgentRealm reconciles agent action with the assigned principal in real time.
- AI Voice Clone Bypasses Centrelink and ATO Identity Verification (High, Mar 2023)
- Lloyds Bank Voice ID bypassed by ElevenLabs synthetic voice clone (High, Feb 2023)
- BBC demo bypasses Santander and Halifax voice ID with an AI-cloned voice (Medium, Nov 2024)
Prompt Injection
1 on this surface
1 Medium
Runtime control: OmniGuard intercepts injection patterns at the prompt and tool-call layer. Prism flags concept activations that indicate the model is being redirected.
- A Walmart AI voice agent was bypassed with classic prompt injection to reach a human (Medium, Feb 2026)
Data Leakage
1 on this surface
1 Medium
Runtime control: OmniGuard redacts inline. Prism observes the model's representations to flag identity-bound content before it reaches a response. AIDR provides the audit trail.
- Huq location data transmitted by apps despite user opt-outs (Medium, Dec 2021)
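As a rough illustration of what inline redaction means, the sketch below masks two kinds of sensitive strings in a response before it is returned. It assumes simple regular expressions for email addresses and card-like digit runs. The patterns, placeholder tokens, and `redact` function are hypothetical and are not drawn from OmniGuard.

```python
import re

# Illustrative patterns only; real detectors need far broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def redact(text: str) -> str:
    """Replace email addresses and card-like numbers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    return CARD.sub("[CARD]", text)


print(redact("Reach me at ann@example.com, card 4111 1111 1111 1111."))
# prints: Reach me at [EMAIL], card [CARD].
```

Pattern matching of this kind only catches content with a recognizable surface form. It would not catch location or identity data that leaks through other channels.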

Where this surface bites hardest

See how Realm catches these failure modes at runtime, before they reach a user.
