Public-sector voice agent failed Spanish-accented English callers at 4x the rate of native speakers

A state-government voice agent for benefits eligibility failed Spanish-accented English speakers at four times the rate of native English speakers. The fairness audit began only after a single state legislator called the line.

Anonymized: Public Sector · US · State agency · Incident Nov 4, 2025 · Indexed May 13, 2026 · Steward-verified · NDA

What happened

A state-government voice agent for benefits eligibility was found to misroute or terminate calls from Spanish-accented English speakers at approximately four times the rate of native English speakers. A state legislator who called the line on behalf of a constituent flagged the problem. The agency disabled the agent and re-procured the service.

The case is anonymized, but the pattern is widely known among public-sector voice procurement teams. Accent-driven accuracy gaps in voice agents carry direct civil-rights exposure.

What broke inside the model

Failure path · mode profile · Policy Violation

01 · Trigger: A prompt pushes against a deployment boundary.
02 · Model step: The model produces the disallowed output.
03 · Control gap: No enforcement blocks it at generation time.
04 · Failure: The output crosses the policy line.
05 · Consequence: A limit the business set is breached in public.

The output crosses a policy boundary the deployment had defined.

Speech-to-text accuracy varies by accent. The model's downstream intent classifier is trained on transcripts; if the transcripts are wrong, the intent is wrong; if the intent is wrong, the call gets misrouted. The model is not biased on purpose. The pipeline is biased by composition.
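The compounding described above can be made concrete with a little arithmetic. The following is a minimal sketch, not a reconstruction of the agency's pipeline: the function name `misroute_rate` and every accuracy figure in it are hypothetical, chosen only to show how a gap at the speech-to-text stage carries through to routing even when the intent classifier itself treats every caller identically.

```python
def misroute_rate(transcript_ok: float,
                  intent_ok_clean: float,
                  intent_ok_garbled: float) -> float:
    """Probability that a call is misrouted in a two-stage pipeline.

    transcript_ok     -- chance the speech-to-text output preserves the request
    intent_ok_clean   -- chance the intent is right given a good transcript
    intent_ok_garbled -- chance the intent is right given a bad transcript
    """
    routed_correctly = (transcript_ok * intent_ok_clean
                        + (1.0 - transcript_ok) * intent_ok_garbled)
    return 1.0 - routed_correctly


# Hypothetical figures for illustration only; none come from the incident.
# The intent classifier is the same for both groups. Only the transcript
# quality differs.
INTENT_OK_CLEAN = 0.97
INTENT_OK_GARBLED = 0.40

native = misroute_rate(0.95, INTENT_OK_CLEAN, INTENT_OK_GARBLED)
accented = misroute_rate(0.75, INTENT_OK_CLEAN, INTENT_OK_GARBLED)

print(f"native misroute rate:   {native:.4f}")
print(f"accented misroute rate: {accented:.4f}")
print(f"ratio:                  {accented / native:.2f}")
```

Under these assumed numbers, a 20-point gap in transcript quality produces roughly a threefold gap in misrouting, with no group-specific behavior anywhere downstream of transcription.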

Cite this entry

Permalink: https://failureindex.ai/failures/anonymized-fintech-voice-agent-spanish-accent

Citation

AI Failure Index. "Public-sector voice agent failed Spanish-accented English callers at 4x the rate of native speakers" (FI-0020). Realm Labs. https://failureindex.ai/failures/anonymized-fintech-voice-agent-spanish-accent (indexed May 13, 2026).

Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0020. Full dataset at /data.

How Realm would have caught this

Controls for this failure mode

Prism
OmniGuard

Realm reads the agent's behavior distribution across protected-attribute proxies (here, accent) and flags divergences beyond a defined threshold. The audit becomes continuous instead of episodic. The agency catches the gap before the legislator does.
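A continuous check of this kind can be sketched in a few lines. This is an illustrative assumption about what such a monitor might compute, not Realm's implementation and not the API of Prism or OmniGuard: the function names, the `(group, failed)` record shape, and the 1.25 ratio threshold are all hypothetical. It also assumes each call already carries a group label, which in practice would have to come from an accent proxy with its own error rate.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def failure_rates(calls: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Per-group failure rate from (group, failed) call records."""
    totals: Dict[str, int] = defaultdict(int)
    failures: Dict[str, int] = defaultdict(int)
    for group, failed in calls:
        totals[group] += 1
        if failed:
            failures[group] += 1
    return {group: failures[group] / totals[group] for group in totals}


def flag_disparities(calls: Iterable[Tuple[str, bool]],
                     reference_group: str,
                     max_ratio: float = 1.25) -> Dict[str, float]:
    """Return groups whose failure rate exceeds max_ratio times the reference.

    The 1.25 default is a hypothetical threshold, not a legal standard.
    """
    rates = failure_rates(calls)
    reference_rate = rates[reference_group]
    flagged: Dict[str, float] = {}
    for group, rate in rates.items():
        if group == reference_group:
            continue
        if reference_rate == 0.0:
            # Any failures at all diverge from a failure-free reference.
            ratio = float("inf") if rate > 0.0 else 1.0
        else:
            ratio = rate / reference_rate
        if ratio > max_ratio:
            flagged[group] = ratio
    return flagged


# Synthetic records for illustration: 5% failures vs 20% failures.
sample_calls = (
    [("native", False)] * 95 + [("native", True)] * 5
    + [("spanish_accented", False)] * 80 + [("spanish_accented", True)] * 20
)
print(flag_disparities(sample_calls, reference_group="native"))
```

Run on a rolling window of production calls rather than a one-off sample, a check like this would surface a fourfold gap as soon as enough calls had accumulated to estimate each group's rate.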

Related failures

Meta contractors posed as teenagers to probe rival chatbots with thousands of crisis prompts

Medicare's AI prior-authorization pilot drew a federal reprimand after delays and disputed denials

Procureur général du Canada sanctioned pro se litigant for AI fabricated case law