Lawyers cited six fake cases generated by ChatGPT in federal court
In Mata v. Avianca, two attorneys filed a brief citing six judicial decisions that did not exist. ChatGPT had fabricated them. The court sanctioned the lawyers and the case became the inflection point for legal AI policy.
Twelve hundred AI-hallucinated court citations later, the bar still treats this as a training problem instead of an enforcement problem.
Key facts
- What: In Mata v.
- Incident date: May 4, 2023
- Who: Levidow, Levidow & Oberman
- Failure mode: Hallucination
- AI surface: Chatbot
- Severity: Catastrophic
What happened
In Mata v. Avianca, Inc., two attorneys at Levidow, Levidow & Oberman filed a brief in March 2023 that cited six judicial decisions. Opposing counsel could not find any of the six. Neither could the court, which on May 4, 2023 ordered the attorneys to show cause why they should not be sanctioned. The cases did not exist. ChatGPT had generated them, and when asked to verify them, ChatGPT had assured the attorneys the cases were real.
On June 22, 2023, Judge P. Kevin Castel imposed a $5,000 penalty, jointly and severally, on both attorneys and their firm, and required them to send copies of the sanctions order to every judge falsely named as the author of a fake opinion. The order became the founding document of legal AI policy reform in the United States.
By April 2026, researcher Damien Charlotin's public database had cataloged 1,227 separate instances of AI-hallucinated citations submitted to courts worldwide, with new cases added at five to six per day. The Mata sanction did not deter the next 1,200 occurrences.
What broke inside the model
- 01 · Trigger: A user asks for a fact, a citation, or a figure.
- 02 · Model step: The model writes a fluent, confident answer.
- 03 · Control gap: Nothing ties the claim back to a real source.
- 04 · Failure: A fabricated fact ships as if it were verified.
- 05 · Consequence: The false claim reaches a customer, a court, or the public.
Confidence holds, and even spikes, as the claim detaches from any source.
ChatGPT is a chatbot, not a research tool. Its output is sampled from a probability distribution over plausible text, not retrieved from a verified case-law database. The model generated plausible-sounding citations that matched the structure of real case law. When the attorneys asked it to verify the citations, the model generated plausible-sounding verifications by the same process. There is no truth layer in the chatbot interface.
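A deliberately crude analogy may make the gap concrete. The sketch below is not how a language model works internally; it only illustrates that a generator can emit strings with the right shape without ever consulting a record of real cases. The function name, the reporter list, and the number ranges are assumptions made for illustration.

```python
import random
import re

# Shape of a simplified reporter citation: volume, reporter, first page.
CITATION_SHAPE = re.compile(r"\d{1,4} (?:F\.2d|F\.3d|F\. Supp\. 2d) \d{1,4}")


def sample_citation(rng: random.Random) -> str:
    """Assemble a citation-shaped string from random parts.

    Nothing here looks anything up, so a well-formed result says
    nothing about whether the cited case exists.
    """
    volume = rng.randint(1, 999)
    reporter = rng.choice(["F.2d", "F.3d", "F. Supp. 2d"])
    page = rng.randint(1, 1500)
    return f"{volume} {reporter} {page}"


rng = random.Random(0)
candidate = sample_citation(rng)
# The shape check passes by construction; existence is never tested.
print(candidate, bool(CITATION_SHAPE.fullmatch(candidate)))
```

Every string this function returns passes the shape check, which is why a format check alone cannot stand in for a lookup against real records.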
What it cost
A $5,000 sanction, reputational damage, and the 1,200+ subsequent cases across the bar
Sources
- Court Filing: Mata v. Avianca, Inc., 22-cv-1461 (PKC), Sanction Order (courtlistener.com)
- Primary: AI Hallucination Cases Database, Damien Charlotin (damiencharlotin.com)
Cite this entry
https://failureindex.ai/failures/mata-v-avianca-fake-legal-citations
AI Failure Index. "Lawyers cited six fake cases generated by ChatGPT in federal court" (FI-0008). Realm Labs. https://failureindex.ai/failures/mata-v-avianca-fake-legal-citations (indexed May 13, 2026).
Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0008. Full dataset at /data.
Note from Realm Labs, the Index steward
How Realm would have caught this
- Prism
- OmniGuard
- AI Detection & Response (AIDR)
For a legal-research deployment, Realm sits between the model and the source-of-truth case-law system. Every citation the model produces is checked against an authoritative database before the response renders. If the citation does not exist, OmniGuard either rewrites the response to remove the fabricated reference or blocks the response and surfaces an error. The brief that ends up in court is the one the model could verify, not the one the model was hoping was true.
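The check described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Realm's or OmniGuard's actual implementation: `KNOWN_CITATIONS` is a hypothetical stand-in for an authoritative case-law database, `gate_response` and `GateResult` are invented names, and the regular expression covers only a few reporter formats.

```python
import re
from dataclasses import dataclass

# Hypothetical stand-in for an authoritative case-law database.
# A real deployment would query a court-records service instead.
KNOWN_CITATIONS = {
    "550 U.S. 544",  # Bell Atlantic Corp. v. Twombly
    "556 U.S. 662",  # Ashcroft v. Iqbal
}

# Simplified pattern: volume, reporter abbreviation, first page.
CITATION_RE = re.compile(
    r"\b\d{1,4} (?:U\.S\.|F\.[23]d|F\. Supp\. [23]d) \d{1,4}\b"
)


@dataclass
class GateResult:
    allowed: bool          # whether any text may be shown to the user
    text: str              # the text to render (possibly rewritten)
    unverified: list[str]  # citations that were not found


def gate_response(text: str, known: set[str], block: bool = False) -> GateResult:
    """Check every citation in a draft before it renders."""
    cited = CITATION_RE.findall(text)
    unverified = [c for c in cited if c not in known]
    if not unverified:
        return GateResult(True, text, [])
    if block:
        # Blocking mode: withhold the whole response and surface an error.
        return GateResult(False, "Response blocked: unverified citations.", unverified)
    # Rewrite mode: strip each citation that could not be verified.
    redacted = text
    for citation in unverified:
        redacted = redacted.replace(citation, "[citation removed: not found]")
    return GateResult(True, redacted, unverified)


# "999 F.3d 9999" is a placeholder that is absent from the toy set above.
draft = "See Twombly, 550 U.S. 544, and Doe v. Example Air, 999 F.3d 9999."
result = gate_response(draft, KNOWN_CITATIONS)
print(result.text)
print(result.unverified)
```

The design choice that matters is that the lookup runs outside the model: asking the model to confirm its own citation, as the attorneys did, adds no independent evidence.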