An airline chatbot gave a passenger a wrong refund policy, echoing the Air Canada problem
A year after the Air Canada tribunal ruling, passengers reported that airline and travel-agency chatbots were still stating refund and rebooking policies that did not match the carriers' actual rules. The reports show the hallucinated-policy failure mode persisting across the travel industry.
A year after the Air Canada ruling, travel chatbots were still confidently stating refund policies that did not exist.
Key facts
- What: A year after the Air Canada tribunal ruling, passengers reported that airline and travel-agency chatbots were still stating refund and rebooking policies that did not match the carriers' actual rules. The reports show the hallucinated-policy failure mode persisting across the travel industry.
- Incident date: Feb 10, 2025
- Who: Air India Express / MakeMyTrip
- Failure mode: Hallucination
- AI surface: Chatbot
- Severity: Medium
What happened
In early 2025, travelers documented airline and online-travel-agency chatbots confidently stating refund, baggage, and rebooking terms that contradicted the carriers' published policies. This is the same hallucinated-policy failure that produced the Air Canada ruling. The cases reinforced that customer-service bots still invent policy when they lack grounding.
What broke inside the model
- 01 · Trigger: A user asks for a fact, a citation, or a figure.
- 02 · Model step: The model writes a fluent, confident answer.
- 03 · Control gap: Nothing ties the claim back to a real source.
- 04 · Failure: A fabricated fact ships as if it were verified.
- 05 · Consequence: The false claim reaches a customer, a court, or the public.
Confidence holds, and even spikes, as the claim detaches from any source.
The system produced fluent, confident output with no grounding in any source. Hallucination is a property of how the model generates, not a bug in one prompt: the most likely next token is not the same as the true one, and nothing in the pipeline compared the answer against a source of truth before it shipped.
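The missing control described above, comparing a drafted answer against a source of truth before it ships, can be sketched roughly as follows. This is a minimal illustration and not how any carrier's chatbot is actually built. The policy text in `POLICY_SOURCES`, the names `gate_reply` and `is_supported`, and the 0.8 overlap threshold are all hypothetical. The check itself is a crude word-overlap test, standing in for the retrieval or entailment check a production system would more plausibly use.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List

# Hypothetical published-policy snippets, invented for this example.
# A real deployment would load the carrier's actual policy text.
POLICY_SOURCES = {
    "refund": "Refund requests must be submitted before departure. "
              "Refunds after travel are not available.",
    "baggage": "Each passenger may check one bag up to 15 kg.",
}

# Reply sent instead of the draft when a claim cannot be traced to a source.
FALLBACK = ("I can't confirm that policy. Please check the published "
            "policy page or ask a human agent.")


@dataclass
class GateResult:
    approved: bool
    reply: str
    unsupported: List[str]


def _tokens(text: str) -> set:
    """Lowercased alphanumeric word set of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def is_supported(sentence: str, sources: Iterable[str],
                 threshold: float = 0.8) -> bool:
    """True if enough of the sentence's words appear in one source text.

    Assumption: lexical overlap is only a rough proxy for grounding.
    It does not detect negation or paraphrase.
    """
    words = _tokens(sentence)
    if not words:
        return True
    return any(len(words & _tokens(src)) / len(words) >= threshold
               for src in sources)


def gate_reply(draft: str, sources=POLICY_SOURCES,
               threshold: float = 0.8) -> GateResult:
    """Hold a drafted reply if any sentence lacks support in the sources."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", draft)
                 if s.strip()]
    unsupported = [s for s in sentences
                   if not is_supported(s, list(sources.values()), threshold)]
    if unsupported:
        return GateResult(False, FALLBACK, unsupported)
    return GateResult(True, draft, [])


if __name__ == "__main__":
    grounded = gate_reply("Refunds after travel are not available.")
    invented = gate_reply(
        "You can claim a bereavement refund within 90 days after travel.")
    print(grounded.approved, invented.approved)
```

In this sketch the first draft passes because every word appears in the refund policy text, while the second is held and replaced with the fallback reply. The word-overlap test has an obvious weakness: a draft that drops the word "not" from a policy sentence would still pass, which is why the comparison step in a real pipeline would need something stronger than token matching.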
What it cost
Customer disputes; reputational risk for carriers
Sources
Cite this entry
https://failureindex.ai/failures/airline-chatbot-gave-passenger-wrong-refund
AI Failure Index. "An airline chatbot gave a passenger a wrong refund policy, echoing the Air Canada problem" (FI-0069). Realm Labs. https://failureindex.ai/failures/airline-chatbot-gave-passenger-wrong-refund (indexed Jun 3, 2026).
Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0069. Full dataset at /data.
Note from Realm Labs, the Index steward
How Realm would have caught this
- Prism
- OmniGuard
- AI Detection & Response (AIDR)
A runtime layer that watches the model's internal state can flag the moment a model commits to a claim it has no support for, and hold or reroute the response before it reaches a user. Realm reads those signals in real time rather than grading the transcript after the fact.