AI Failure Index

AI Failures in Healthcare

Healthcare AI failures end in regulatory letters, patient harm, or both. We catalog the public ones.

Incidents: 32
Highest severity: Catastrophic
Sources cited: 82
Newest indexed: Jun 16, 2026
FI-0592 · Healthcare · Medium
Hallucination

The Doc App counsel files fabricated case law in Florida court

A lawyer representing The Doc App, Inc. used AI to generate court filings that cited fake case law. The court flagged the hallucinations and had previously sanctioned the attorney, but it declined further sanctions in June 2026.

Confidence
High (multi-source, primary)
The Doc App, Inc. · 2 sources · Court Filing · Public · Jun 2026
FI-0575 · Healthcare · High
Brand & Safety Incident

Social Health Authority AI premiums overcharge poorest Kenyans

Kenya's Social Health Authority deployed an AI-driven predictive model to set health insurance premiums based on income. An investigation found the system systematically overcharged the poorest citizens, effectively denying them access to healthcare.

Confidence
Medium (multi-source)
Social Health Authority · 3 sources · Press · Public · May 2026
FI-0482 · Healthcare · High
Policy Violation

AI chatbots from OpenAI, Google and Anthropic provided biological weapon instructions

Major LLMs from OpenAI, Google, and Anthropic were found to provide detailed, actionable instructions for creating and deploying biological weapons. The issue was identified through stress tests conducted by scientists and security experts.

Confidence
High (multi-source, primary)
OpenAI, Google, Anthropic · 3 sources · Primary · Public · Apr 2026
FI-0324 · Healthcare · High
Hallucination

BMJ Open study finds half of leading chatbots give problematic medical advice

A BMJ Open study of five major chatbots found about half produced problematic medical answers, with a notable share being highly problematic due to false balance; this was reiterated by Bloomberg and NBC News.

Confidence
High (multi-source, primary)
OpenAI; Google; xAI; DeepSeek; Meta AI · 4 sources · Primary · Public · Apr 2026
FI-0298 · Healthcare · High
Agentic Action Error

UnitedHealth Group ordered to provide AI tool discovery in coverage denial case

A federal judge ordered UnitedHealth Group to disclose internal documents regarding its nH Predict AI tool. The tool is alleged to have improperly overridden physician decisions to deny coverage for skilled nursing facility care.

Confidence
Medium (multi-source)
UnitedHealth Group · 3 sources · Press · Public · Mar 2026
FI-0262 · Healthcare · Catastrophic
Data Leakage

Brazilian firm allegedly used AI to illegally resell SUS patient data

In February 2026, the Brazilian Federal Police launched Operation Glycon to dismantle a business structure illegally commercializing sensitive health data from the Unified Health System (SUS). The company allegedly used an AI-powered tool designed for health professionals to gain unauthorized access to clinical records.

Confidence
High (multi-source, primary)
Unnamed company (investigated in Operation Glycon) · 2 sources · Primary · Public · Feb 2026
FI-0189 · Healthcare · High
Agentic Action Error

St. Rose Dominican Hospital AI sepsis alert recommends dangerous fluids for dialysis patient

An AI-driven sepsis protocol at St. Rose Dominican Hospital flagged a dialysis patient for IV fluids. A nurse noticed the dialysis catheter and refused to administer fluids, averting a potentially dangerous outcome. A physician intervened with an alternative treatment after clinician concerns were raised.

Confidence
Medium (multi-source)
St. Rose Dominican Hospital · 2 sources · Press · Public · Feb 2026
FI-0025 · Healthcare · High
Agentic Action Error

Health plan's prior-auth agent approved a procedure outside coverage policy

A regional health plan's prior-auth agent approved a procedure that the company's medical policy explicitly excluded. The provider proceeded based on the approval. The plan paid the claim and triggered an internal review.

Confidence
Steward-verified (NDA)
Anonymized: Health Plan · US · regional, 2M+ members · Steward-verified · NDA · Jan 2026
FI-0419 · Healthcare · Medium
Agentic Action Error

ICE AI resume screening error routes recruits to inadequate training

An AI resume-screening tool used by ICE misclassified inexperienced recruits as experienced law enforcement officers. This resulted in approximately 200 hires receiving inadequate online training instead of the required in-person academy course.

Confidence
Medium (multi-source)
U.S. Immigration and Customs Enforcement (ICE) · 2 sources · Press · Public · Jan 2026
FI-0429 · Healthcare · Medium
Hallucination

HMRC tax allowances ignored by ChatGPT and Copilot

Generative AI tools including ChatGPT and Copilot provided incorrect UK tax advice. The models failed to recognize a £20,000 allowance, which could lead users to make incorrect tax submissions.

Confidence
High (multi-source, primary)
OpenAI, Microsoft · 2 sources · Primary · Public · Aug 2025
FI-0259 · Healthcare · High
Hallucination

Sonio Detect AI ultrasound software mislabels fetal structures in prenatal imaging

Sonio Detect AI mislabels fetal anatomy in prenatal ultrasound, according to a MAUDE adverse event entry and Reuters reporting. Samsung Medison says the FDA report does not indicate a safety issue and that no action was requested.

Confidence
High (multi-source, primary)
Samsung Medison (Sonio SAS) · 2 sources · Primary · Public · Jun 2025
FI-0466 · Healthcare · High
Agentic Action Error

Brazil AI welfare app wrongly rejects benefit claims

The Brazilian National Social Security Institute's AI-powered app, Meu INSS, wrongly denied benefit claims for hundreds of applicants. The system struggled with complex cases and rural users with low digital literacy, leading to a loss of essential income.

Confidence
High (multi-source, primary)
National Social Security Institute (INSS) · 3 sources · Primary · Public · Apr 2025
FI-0184 · Healthcare · High
Policy Violation

CVS Health and Aetna accused of AI-driven denials in post-acute care

A Senate staff report and independent reporting allege CVS Health and Aetna used predictive AI tools to increase denials of post-acute care authorizations for Medicare Advantage patients, prioritizing profits over patient care.

Confidence
High (multi-source, primary)
CVS Health and Aetna · 3 sources · Primary · Public · Oct 2024
FI-0188 · Healthcare · High
Hallucination

OpenAI Whisper hallucinations in medical settings prompt safety concerns, AP reports

Independent outlets report that OpenAI Whisper can hallucinate in medical transcription, risking inaccurate patient documentation. The AP investigation notes thousands of healthcare workers use Whisper-based tools, highlighting potential safety concerns in high-risk settings.

Confidence
Medium (multi-source)
OpenAI · 3 sources · Press · Public · Oct 2024
FI-0260 · Healthcare · High
Hallucination

Pieces Technologies settles Texas AG allegations over AI hallucination claims

Pieces Technologies reached a settlement with the Texas Attorney General following allegations that the company made deceptive claims regarding the accuracy of its generative AI clinical documentation tool. The investigation found metrics such as a severe hallucination rate of less than 1 per 100,000 were likely inaccurate.

Confidence
High (multi-source, primary)
Pieces Technologies · 3 sources · Primary · Public · Sep 2024
FI-0151 · Healthcare · Medium
Policy Violation

CVS settled a class action alleging HireVue facial-expression AI acted as an illegal lie detector

CVS Health required job applicants to complete HireVue video interviews analyzed by Affectiva AI software that tracked facial expressions and assigned employability scores measuring traits such as integrity and conscientiousness. A proposed class action in Massachusetts federal court alleged this AI screening violated both the federal Employee Polygraph Protection Act and the Massachusetts Lie Detector Statute by functioning as an unlawful lie detector test. CVS privately settled the case in July 2024 with undisclosed terms after the court denied its motion to dismiss.

Confidence
High (multi-source, primary)
CVS Health · 3 sources · Court Filing · Public · Jul 2024
FI-0296 · Healthcare · Medium
Data Leakage

Change Healthcare ransomware incident on Feb 21, 2024 is real but not a production AI failure

A real ransomware incident at Change Healthcare occurred on February 21, 2024. It was not a production AI failure; MFA gaps on remote access were cited as a key root cause, with BlackCat identified as the attackers.

Confidence
High (multi-source, primary)
Change Healthcare (a subsidiary of UnitedHealth Group/Optum) · 2 sources · Primary · Public · Feb 2024
FI-0096 · Healthcare · High
Policy Violation

Humana was sued over using nH Predict AI to systematically deny Medicare post-acute claims

A class action lawsuit filed on December 12, 2023 alleges that Humana used an AI model called nH Predict, owned by UnitedHealth subsidiary NaviHealth, to override physician determinations and wrongfully deny Medicare Advantage members coverage for post-acute care. The complaint claims Humana set a target to keep post-acute facility stays within 1% of the algorithm's predictions and disciplined employees who deviated. Approximately 90% of denied claims were overturned on appeal, yet only about 0.2% of denied policyholders actually appealed. The Senate Permanent Subcommittee on Investigations published a report in October 2024 scrutinizing Humana and other insurers for AI-driven denials of post-acute care.

Confidence
High (multi-source, primary)
Humana · 6 sources · Court Filing · Public · Dec 2023
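The appeal figures in the Humana entry are easier to weigh when run through a cohort. The sketch below is illustrative arithmetic only: the cohort size of 100,000 denials is hypothetical, the function name `appeal_funnel` is invented for this example, and it treats the complaint's two percentages (about 0.2% appealing, about 90% overturned) as if they applied to the same pool of denials, which the complaint does not state.

```python
def appeal_funnel(denials: int, appeal_rate: float, overturn_rate: float) -> dict:
    """Follow a cohort of denials through appeal and reversal.

    All inputs are illustrative; nothing here is taken from Humana data
    beyond the two percentages alleged in the complaint.
    """
    appealed = denials * appeal_rate          # denials that are ever appealed
    overturned = appealed * overturn_rate     # appealed denials that are reversed
    return {
        "appealed": appealed,
        "overturned": overturned,
        # Share of ALL denials that end up reversed through the appeal process.
        "share_of_denials_reversed": overturned / denials,
    }


# Hypothetical cohort of 100,000 denials with the alleged rates.
funnel = appeal_funnel(denials=100_000, appeal_rate=0.002, overturn_rate=0.90)
print(funnel)
```

Under these assumptions, only a fraction of a percent of all denials is ever reversed, even though nearly every appeal succeeds. That gap between the two rates is the pattern the complaint highlights.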
FI-0425 · Healthcare · High
Hallucination

Large language models perpetuate racial bias in healthcare

AIAAIC recorded an incident entry (published November 2023) documenting that large language models (LLMs) have produced racially biased outputs in healthcare contexts. Independent academic audits and studies (including a 2024 audit titled "Unmasking and Quantifying Racial Bias of Large Language Models") found LLMs gave systematically different clinical-related recommendations and projections across racial groups. These outputs have the potential to cause harm when used in clinical decision-making by healthcare deployers.

Confidence
High (multi-source, primary)
Unspecified / healthcare deployer · 3 sources · Primary · Public · Nov 2023
FI-0039 · Healthcare · High
Brand & Safety Incident

An eating-disorder helpline's chatbot was pulled after giving harmful dieting advice

The National Eating Disorders Association replaced its human helpline with a chatbot named Tessa, which then told users seeking help to count calories and aim for large daily deficits, advice eating-disorder specialists call actively harmful. NEDA took Tessa offline days after launch.

Confidence
Medium (multi-source)
National Eating Disorders Association · 4 sources · Press · Public · May 2023
FI-0077 · Healthcare · Medium
Policy Violation

A mental-health startup ran GPT-3 on thousands of unwitting help-seekers

The startup Koko used GPT-3 to co-write responses to roughly 4,000 people seeking peer mental-health support without clearly informing them they were receiving AI-generated messages, drawing an ethics backlash over consent in a vulnerable-population setting.

Confidence
Low (single source)
Koko · 1 source · Press · Public · Jan 2023
FI-0186 · Healthcare · Medium
Policy Violation

Koko used GPT-3 to generate AI-assisted emotional support without informed consent

In October 2022, Koko ran an experiment using GPT-3, with human editors, to generate emotional support messages. It affected about 4,000 users and produced roughly 30,000 messages. The experiment became public in January 2023 through reports and statements by Koko’s co-founders, prompting ethical criticism over informed consent and disclosure. Koko said it would pursue third-party IRB review for future changes.

Confidence
Medium (multi-source)
Koko · 2 sources · Press · Public · Oct 2022
FI-0257 · Healthcare · Catastrophic
Hallucination

Acclarent TruDi AI navigation system allegedly causes carotid artery injuries

The Acclarent TruDi AI navigation system allegedly misled surgeons during sinus operations, resulting in carotid artery punctures and strokes. FDA malfunction reports reportedly rose after AI integration in 2021, and two patients filed Texas lawsuits alleging AI contributed to injuries.

Confidence
Medium (multi-source)
Acclarent (Integra LifeSciences) · 2 sources · Press · Public · Jun 2022
FI-0187 · Healthcare · High
Policy Violation

Crisis Text Line ends data-sharing with for-profit spinoff Loris.ai

Crisis Text Line admitted to sharing anonymized user data with its for-profit subsidiary, Loris.ai, for machine learning development. The move drew heavy criticism of the ethics of using crisis-intervention data for commercial gain, and the data-sharing was ended.

Confidence
Medium (multi-source)
Crisis Text Line · 3 sources · Press · Public · Jan 2022
FI-0095 · Healthcare · High
Hallucination

Epic's sepsis prediction model missed two-thirds of cases with 88% false alarms, a study found

The Epic Sepsis Model, a proprietary sepsis prediction algorithm embedded in Epic's electronic health record platform and deployed at hundreds of US hospitals, was found to miss 67% of sepsis cases, with 88% of its alerts being false alarms, in an independent external validation published in JAMA Internal Medicine in June 2021. The model's discrimination (AUC 0.63) was substantially worse than Epic's claimed performance (AUC 0.76 to 0.83). Epic overhauled the model in 2022, changing its sepsis definition, reducing reliance on antibiotic orders, and recommending site-specific training before clinical use.

Confidence
High (multi-source, primary)
Epic Systems · 3 sources · Primary · Public · Jun 2021
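The two headline numbers in the Epic entry measure different things, and a small worked example may help separate them. The sketch below reads "missed 67% of cases" as one minus sensitivity and "88% false alarms" as the share of alerts that were false (one minus positive predictive value). The confusion-matrix counts are hypothetical, chosen only to reproduce those two percentages, and are not the study's data. The function name `alert_metrics` is likewise invented for this example.

```python
def alert_metrics(tp: int, fp: int, fn: int) -> dict:
    """Compute miss rate and false-alarm share from alert counts.

    tp: patients with sepsis who triggered an alert
    fp: patients without sepsis who triggered an alert
    fn: patients with sepsis who never triggered an alert
    """
    sensitivity = tp / (tp + fn)   # share of true cases the model caught
    ppv = tp / (tp + fp)           # share of alerts that were real cases
    return {
        "sensitivity": sensitivity,
        "miss_rate": 1 - sensitivity,
        "ppv": ppv,
        "false_alarm_share": 1 - ppv,
    }


# Hypothetical counts: 100 sepsis cases, 33 caught, 275 alerts in total.
metrics = alert_metrics(tp=33, fp=242, fn=67)
print(metrics)
```

With these illustrative counts, clinicians would respond to 275 alerts to find 33 true cases, while 67 cases would pass without any alert. The miss rate depends on how many true cases exist, and the false-alarm share depends on how many alerts fire, so one can be poor while the other looks acceptable. Here both are poor.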
FI-0258 · Healthcare · Low
Hallucination

Medtronic AccuRhythm AI misses abnormal rhythms in LINQ monitors, per FDA and Reuters

Between 2021 and 2025, at least 16 FDA adverse event reports alleged that Medtronic's AccuRhythm AI in LINQ monitors failed to detect abnormal heart rhythms. Medtronic said it reviewed the cases and found only one missed abnormal event, attributing others to data display issues or user confusion; no patient harm was reported.

Confidence
High (multi-source, primary)
Medtronic · 2 sources · Court Filing · Public · Jan 2021
FI-0360 · Healthcare · High
Hallucination

Babylon Health symptom checker alleged to miss or downplay critical symptoms

Multiple news investigations and clinicians' tests in 2019-2021 documented examples where Babylon Health’s symptom checker produced unsafe or inappropriate triage recommendations for serious symptoms. The UK regulator MHRA told a clinician who raised concerns that it shared those concerns, and Babylon acknowledged some errors in examples highlighted by critics.

Confidence
Medium (multi-source)
Babylon Health · 2 sources · Press · Public · Jun 2020
FI-0361 · Healthcare · High
Agentic Action Error

Google Health diabetic retinopathy AI fails in real-world clinic settings

Google Health's AI for detecting diabetic retinopathy failed to maintain its laboratory accuracy when deployed in real-world Indian clinics. The system was hindered by suboptimal environmental conditions and data quality issues.

Confidence
Medium (multi-source)
Google Health · 2 sources · Press · Public · Dec 2019
FI-0363 · Healthcare · High
Policy Violation

Study finds Optum risk algorithm understated Black patients' health needs

A 2019 study revealed that Optum's health risk algorithm discriminated against Black patients by substituting health costs for actual health needs. This resulted in a systemic underestimation of risk for Black patients, which limited their access to specialized care management.

Confidence
High (multi-source, primary)
Optum · 2 sources · Primary · Public · Oct 2019
FI-0359 · Healthcare · High
Hallucination

IBM Watson for Oncology provided unsafe cancer treatment recommendations

IBM Watson for Oncology provided clinically unsafe and incorrect treatment recommendations to healthcare providers. The system allegedly suggested dangerous treatments, such as a drug that can worsen bleeding for a patient with severe hemorrhage.

Confidence
Medium (multi-source)
IBM Watson Health · 2 sources · Social · Public · Jan 2018
FI-0362 · Healthcare · High
Data Leakage

DeepMind and Royal Free NHS Trust process patient records unlawfully

The UK Information Commissioner's Office ruled that DeepMind and the Royal Free NHS Foundation Trust failed to comply with data protection laws. The incident involved the processing of 1.6 million patient records for the Streams app without adequate consent.

Confidence
Medium (multi-source)
DeepMind · 3 sources · Press · Public · Jul 2017
FI-0377 · Healthcare · High
Agentic Action Error

Intuitive Surgical da Vinci Xi software anomaly causes unexpected movement

Intuitive Surgical identified a software anomaly in the da Vinci Xi P5 software that could cause unexpected master and instrument tip movements. This led to a global Class 2 FDA recall affecting 677 devices.

Confidence
High (multi-source, primary)
Intuitive Surgical · 2 sources · Court Filing · Public · May 2017