Wendy's FreshAI drive-thru agent misheard orders and cut customers off mid-sentence

Wendy's deployed FreshAI, a Google Cloud generative AI voice agent, at drive-thru locations beginning with a Columbus, Ohio pilot in June 2023 and expanding to franchisees in 2024. The system frequently misheard orders, cut customers off mid-sentence, failed to process simple customizations like removing a pickle, and interrupted ordering with aggressive upsell suggestions. Customers found the experience so frustrating that some reported permanently driving to farther Wendy's locations that still used human order takers.

Wendy's · Incident Apr 1, 2024 · Indexed Jun 4, 2026 · 3 sources

The voice agent was so eager to upsell and so deaf to customizations that customers literally drove past it to a farther store with a human.
What
Wendy's deployed FreshAI, a Google Cloud generative AI voice agent, at drive-thru locations beginning with a Columbus, Ohio pilot in June 2023 and expanding to franchisees in 2024.
Incident date
Apr 1, 2024
Who
Wendy's
Failure mode
Hallucination
AI surface
Voice Agent
Severity
Medium

What happened

Wendy's partnered with Google Cloud to build FreshAI, a generative AI voice agent for drive-thru ordering, piloting it in Columbus, Ohio in mid-2023 and expanding to franchisee locations in 2024. Customers at AI-equipped locations reported that the system required them to repeat themselves multiple times, cut them off after brief pauses, could not handle basic customizations such as removing a pickle, and interrupted orders with unwanted upsell suggestions. One Reddit user posted that 'the AI order system is bad enough to where I drive to a further different Wendy's now,' and multiple other customers described the AI as 'garbage' and stated they had to request human employees to retake their orders. The complaints were reported in press coverage from The Independent, Delish, and People in early 2025 when Wendy's announced further expansion.

What broke inside the model

Failure path · mode profile · Hallucination
  1. Trigger: A user asks for a fact, a citation, or a figure.
  2. Model step: The model writes a fluent, confident answer.
  3. Control gap: Nothing ties the claim back to a real source.
  4. Failure: A fabricated fact ships as if it were verified.
  5. Consequence: The false claim reaches a customer, a court, or the public.

Confidence holds, and even spikes, as the claim detaches from any source.

The generative AI voice agent failed at core speech recognition and natural language understanding in noisy drive-thru environments, producing incorrect order entries from misinterpreted speech. Its conversational design ended the customer's turn after brief pauses and injected unsolicited upsell suggestions, so customers were interrupted before they could finish speaking. The system could not reliably parse basic customizations such as removing a pickle, which suggests a gap between its training data and the linguistic variety of real drive-thru interactions.
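The premature cutoffs described above are typical of a turn-taking rule that ends the customer's turn once silence exceeds a fixed threshold. The following is a minimal sketch of that rule, not FreshAI's actual logic: the `Segment` type, the `split_turns` function, the timings, and both threshold values are hypothetical and chosen only to illustrate the effect.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """One stretch of detected speech (times in seconds, hypothetical data)."""
    start: float
    end: float
    text: str


def split_turns(segments: List[Segment], max_pause_s: float) -> List[str]:
    """Group speech segments into turns.

    A turn is closed as soon as the silence between two consecutive
    segments exceeds max_pause_s; an agent using this rule would start
    responding at that point.
    """
    turns: List[str] = []
    current: List[str] = []
    prev_end: Optional[float] = None
    for seg in segments:
        if prev_end is not None and seg.start - prev_end > max_pause_s:
            turns.append(" ".join(current))
            current = []
        current.append(seg.text)
        prev_end = seg.end
    if current:
        turns.append(" ".join(current))
    return turns


# Hypothetical order with a pause of about 0.8 s before the customization.
order = [
    Segment(0.0, 1.5, "a cheeseburger"),
    Segment(2.3, 3.4, "with no pickle"),
]

# Assumed short threshold: the pause ends the turn, so the customization
# arrives after the agent has already started responding.
aggressive = split_turns(order, max_pause_s=0.5)

# Assumed longer threshold: the whole order stays in one turn.
patient = split_turns(order, max_pause_s=1.5)
```

With the shorter threshold the order splits into two turns and "with no pickle" is separated from the item it modifies; with the longer one it stays attached. A longer threshold also makes the agent slower to respond, so the value is a trade-off rather than a fix on its own.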

Public visibility: Medium
Regulatory exposure: None
Customer impact: Many customers
Financial impact: Unknown
Time to disclosure: Weeks
  1. Press: 11 Fast Food Chains Using AI Drive-Thrus (Problems & Issues), gocanopy.com
  2. Customer-Disclosed: Dear Wendy's, PLEASE get rid of AI ordering at the drive thru, reddit.com
  3. Press: Wendy's customers bitter over 'garbage' decision to employ AI bots, the-independent.com
Permalink: https://failureindex.ai/failures/wendy-freshai-drive-thru-agent-misheard
Citation: AI Failure Index. "Wendy's FreshAI drive-thru agent misheard orders and cut customers off mid-sentence" (FI-0105). Realm Labs. https://failureindex.ai/failures/wendy-freshai-drive-thru-agent-misheard (indexed Jun 4, 2026).

Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0105. Full dataset at /data.

Note from Realm Labs, the Index steward

How Realm would have caught this

Controls for this failure mode
  • Prism
  • OmniGuard
  • AI Detection & Response (AIDR)

A runtime layer that watches the model's internal state can flag the moment a model commits to a claim it has no support for, and hold or reroute the response before it reaches a user. Realm reads those signals in real time rather than grading the transcript after the fact.
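The "hold or reroute" decision can be pictured as a threshold gate on a per-response support score. This is a generic sketch and not Realm's API or method: the `support_score` input, the `gate` function, the `Action` names, and the threshold values are all assumptions made for illustration, and how such a score would be derived from a model's internal state is outside the sketch.

```python
from enum import Enum


class Action(Enum):
    DELIVER = "deliver"   # send the response to the user as-is
    HOLD = "hold"         # pause and confirm before acting on it
    REROUTE = "reroute"   # hand off, e.g. to a human order taker


def gate(support_score: float,
         hold_below: float = 0.7,
         reroute_below: float = 0.4) -> Action:
    """Map a hypothetical support score in [0, 1] to an action.

    Both thresholds are illustrative defaults, not tuned values.
    """
    if not 0.0 <= support_score <= 1.0:
        raise ValueError("support_score must be between 0 and 1")
    if support_score < reroute_below:
        return Action.REROUTE
    if support_score < hold_below:
        return Action.HOLD
    return Action.DELIVER
```

In a drive-thru setting, `HOLD` might correspond to reading the order back for confirmation and `REROUTE` to passing the customer to an employee, which is what the customers quoted above ended up requesting manually.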