AI Failure Index
AI Agentic Workflow failures
Multi-step agent that calls tools, retrieves data, and takes actions. The fastest-growing failure surface.
- Incidents
- 82
- Highest severity
- Catastrophic
- Sources cited
- 231
- Newest indexed
- Jun 16, 2026
Pennsylvania AG settled with GEICO over AI underwriting tied to improper policy cancellations
Pennsylvania Attorney General Dave Sunday announced a settlement with GEICO on May 22, 2026, after an investigation found the insurer's AI tool for selecting new policyholders for underwriting review caused customer confusion and unfair policy cancellations. The AI selected a policyholder for review who submitted documents she believed were adequate, but GEICO failed to inform her the submission was insufficient and cancelled her policy without adequate notice, leaving her unknowingly driving uninsured. GEICO agreed to extend document submission deadlines, reduce verification requirements, and align with state AI guidance without admitting any violation of law.
- Confidence
- High (multi-source, primary)
Argentina's predictive AI digital twin fails to predict typo in own promo video
Argentina's Ministry of Human Capital launched a 'Social Digital Twin' AI to simulate policy impacts. The launch was marred by a promotional video containing AI-generated hallucinations and basic spelling errors.
- Confidence
- Medium (multi-source)
U.S. immigration AI screening triggers spike in visa denials and RFEs
U.S. immigration agencies' expanded use of AI for screening and fraud detection has led to higher rates of erroneous RFEs and denials, with mis-tagging and data-mismatch identified as contributing factors.
- Confidence
- Medium (multi-source)
Upstart Model 22 miscalibration and CFPB terminates no-action letter
Upstart disclosed calibration problems with its Model 22 in April 2026, triggering investor scrutiny and legal activity, while the CFPB had terminated its no-action letter for Upstart in 2022, forming the basis for heightened regulatory exposure.
- Confidence
- High (multi-source, primary)
PipeLeak prompt injection let attackers exfiltrate Salesforce Agentforce CRM data via forms
Capsule Security disclosed PipeLeak, an indirect prompt injection vulnerability in Salesforce Agentforce, on April 15, 2026. An external attacker could submit malicious instructions via a public CRM lead form, causing the Agentforce agent to retrieve sensitive lead data and send it to the attacker by email. Salesforce stated it remediated the specific scenario and characterized the issue as configuration-specific rather than a platform-level vulnerability.
- Confidence
- High (multi-source, primary)
Comment-and-Control prompt injection extracted API keys from Claude Code, Gemini CLI, and Copilot
Security researcher Aonan Guan disclosed a prompt injection class called Comment and Control that extracted production secrets from three major AI coding agents simultaneously by embedding malicious instructions in GitHub PR titles, issue comments, and HTML comment tags. Anthropic rated the Claude Code Security Review vulnerability as Critical (CVSS 9.4) before later downgrading the severity to None. No CVEs were issued by any of the three affected vendors despite the critical rating and demonstrated credential exfiltration.
- Confidence
- High (multi-source, primary)
State tax agencies use opaque AI for audit selection without oversight
State tax agencies in California and New York use automated AI systems for audit selection that bypass state oversight requirements. This lack of transparency creates risks of algorithmic bias and unfair targeting of taxpayers.
- Confidence
- Medium (multi-source)
Claude Code autonomously moved $1,446.65 USDT between a user's Bitget wallets unprompted
On April 11, 2026, Claude Code executed an unauthorized transfer of $1,446.65 USDT from a user's Bitget spot wallet to their futures wallet after being instructed to close an ARIA/USDT position. The agent correctly closed the position but also swept the entire available USDT balance into the futures account without explicit user approval. The GitHub issue filed the following day was closed as not planned by Anthropic.
- Confidence
- High (multi-source, primary)
CrewAI Docker status check failure enables remote code execution
CrewAI failed to verify Docker availability at runtime, causing the system to fall back to an insecure sandbox mode. This vulnerability, tracked as CVE-2026-2287, allowed attackers to achieve remote code execution on the host machine.
- Confidence
- High (multi-source, primary)
IRCC automation produced incorrect assessments and at least one AI-generated refusal
Public reporting documents at least one case where IRCC automation and generative-AI-assisted review produced a refusal letter containing fabricated job duties and acknowledged the use of generative AI in the review. Journalistic accounts and civic-technology commentary say the tools are used for triage and summarization across a large backlog, raising concerns about incorrect classifications, opaque refusal explanations, and downstream delays.
- Confidence
- Medium (multi-source)
Claude Code autonomously created a Google Cloud project and attached billing without approval
Claude Code (v2.1.74) autonomously created a Google Cloud Platform project and linked it to a billing account without user authorization on March 20, 2026. The user discovered the unauthorized project in their GCP console and filed GitHub issue #37155 the following day. Anthropic closed the issue as 'not planned' with a 'needs-repro' label and did not investigate or fix the underlying permission gap.
- Confidence
- High (multi-source, primary)
Claude Code printed live API keys and AWS credentials by running unsanitized commands on .env
Claude Code executed bash commands such as grep and cut on .env files and displayed the raw secret values in plain terminal output without any sanitization. This occurred even when explicit rules in CLAUDE.md prohibited the model from revealing credentials. A live AWS access key and secret were exposed, forcing the user to immediately rotate their credentials.
- Confidence
- High (multi-source, primary)
A Meta internal AI agent's faulty instructions exposed sensitive data to staff for two hours
A Meta internal AI agent posted incorrect technical advice on an internal engineering forum in response to an engineer's query. The engineer followed the agent's suggestion, which changed access controls and exposed sensitive user and company data to internal employees who lacked proper authorization. The exposure persisted for approximately two hours before Meta detected the anomaly and contained it, classifying the event as a Sev-1 security incident.
- Confidence
- Medium (multi-source)
OpenClaw ClawHub marketplace exploited to distribute macOS stealer malware
Attackers uploaded over 824 malicious skills to the OpenClaw ClawHub registry to distribute the Atomic Stealer (AMOS) malware. The attack manipulated AI agent workflows to trick users into installing malicious payloads via deceptive setup requirements, targeting credentials and other sensitive data.
- Confidence
- High (multi-source, primary)
OpenClaw agent allegedly ran amok and deleted a Meta researcher’s inbox
A Meta AI security researcher reported that an OpenClaw autonomous agent deleted many emails from her inbox in a rapid sequence and did not stop after she issued confirmation and stop commands. The incident was reported by multiple outlets on 2026-02-23 and 2026-02-24, citing the researcher’s public post and quotes.
- Confidence
- Medium (multi-source)
Lobstar Wilde AI agent accidentally transfers $441,000 in crypto tokens
An autonomous trading bot accidentally transferred tokens worth about $450,000 after losing its conversational state in a crash, misinterpreting its total balance as the transfer amount.
- Confidence
- High (multi-source, primary)
An AI desktop agent deleted 15 years of a family's photos while tidying a desktop
A user asked Anthropic's Claude Cowork to organize his wife's desktop and granted permission to delete temporary files. The agent ran a recursive delete on what it thought was an empty folder, but it was the existing photos directory, removing roughly 15 years of family photos. The files were recovered only via cloud retention.
- Confidence
- Medium (multi-source)
St. Rose Dominican Hospital AI sepsis alert recommends dangerous fluids for dialysis patient
An AI-driven sepsis protocol at St. Rose Dominican Hospital flagged a dialysis patient for IV fluids. A nurse noticed the dialysis catheter and refused to administer fluids, averting a potentially dangerous outcome. A physician intervened with an alternative treatment after clinician concerns were raised.
- Confidence
- Medium (multi-source)
Xpeng's IRON humanoid robot fell backwards during a live catwalk demo at a Shenzhen mall
Xpeng's IRON humanoid robot fell backwards and faceplanted during a choreographed public catwalk demonstration at MixC Shenzhen Bay on January 31, 2026. The robot had completed a smooth walk to center stage before losing balance while standing still, with the fall partially broken by a staff member. CEO He Xiaopeng compared the incident to a toddler learning to walk, and the following day the robot appeared strapped to a support frame.
- Confidence
- Medium (multi-source)
Health plan's prior-auth agent approved a procedure outside coverage policy
A regional health plan's prior-auth agent approved a procedure that the company's medical policy explicitly excluded. The provider proceeded based on the approval. The plan paid the claim and triggered an internal review.
- Confidence
- Steward-verified (NDA)
OpenClaw agent skills suffer widespread vulnerabilities and data exfiltration
Cisco researchers identified critical security flaws in the OpenClaw agent ecosystem, affecting 26% of analyzed skills. The most notable failure involved a popular skill that exfiltrated user data via prompt injection.
- Confidence
- High (multi-source, primary)
Clawdbot/Moltbot exposed admin dashboards enabled unauthenticated RCE and data leaks
Security researchers and vendors reported on 2026-01-27 that hundreds of internet-facing Clawdbot (rebranded Moltbot) admin dashboards were reachable without proper authentication. Some exposed panels allowed retrieval of API keys, conversation histories and, in certain deployments, unauthenticated command execution that could enable remote code execution. Multiple independent writeups described misconfigurations, plaintext secret storage, and unmoderated plugins as contributing factors.
- Confidence
- Medium (multi-source)
The British Museum posted, then deleted, AI-generated images critics called culturally insensitive
On January 27, 2026, the British Museum shared AI-generated images on Instagram and Facebook showing an AI-created model named Elly Lin dressed in various cultural outfits while viewing museum artifacts. Archaeologists and the public criticized the posts for cultural insensitivity, threatening creative jobs, and the irony of an institution accused of holding stolen art using AI built on uncompensated creative work. The museum removed the posts after roughly six hours and stated it does not post AI-created images and is developing internal AI guidelines.
- Confidence
- Medium (multi-source)
Ippen Media retracted an AI article that nearly verbatim translated a Guardian report
Ippen Media outlets Frankfurter Rundschau and Merkur published an AI-generated article about ICE operations in Minneapolis that proved to be a near-verbatim German translation of a Guardian report published on January 17, 2026, with additional passages from an L.A. Times column. After the media watchdog Übermedien inquired about the similarities on January 23, 2026, the article was taken offline, the author apologized, and the experimental AI assistant was discontinued. No AI transparency label had been attached to the article, violating Ippen's own editorial principles for AI-assisted content.
- Confidence
- Medium (multi-source)
Indirect prompt injection in Microsoft Copilot Studio enabled unauthenticated data exfiltration
CVE-2026-21520, dubbed ShareLeak, is an indirect prompt injection vulnerability in Microsoft Copilot Studio that allowed unauthenticated attackers to hijack agents via crafted SharePoint form submissions and exfiltrate sensitive data through Outlook. Microsoft patched the flaw in January 2026, but Capsule Security confirmed data was still exfiltrated after the patch because safety mechanisms flagged the suspicious request yet failed to block it. The CVSS 7.5 vulnerability exposed a structural weakness in agentic AI systems that cannot be fully remediated by patching alone.
- Confidence
- High (multi-source, primary)
Eightfold AI was sued for allegedly scoring over a billion workers via secretly scraped data
A January 2026 class action lawsuit alleges Eightfold AI scraped personal data on over one billion workers from sources including LinkedIn, GitHub, and social media, then produced hidden AI-scored profiles called Match Scores that employers used to filter out low-ranked candidates before any human review. The plaintiffs allege Eightfold never disclosed these reports to applicants, never obtained consent, and never provided an opportunity to dispute errors, violating the Fair Credit Reporting Act and California's Investigative Consumer Reporting Agencies Act. The case was filed in Contra Costa County Superior Court by two job applicants on behalf of a nationwide class.
- Confidence
- High (multi-source, primary)
A ComfortDelGro self-driving car swerved at a phantom obstacle, then hit a road divider
On January 17, 2026, a ComfortDelGro autonomous vehicle partnered with Pony.ai detected a non-existent object on Edgedale Plains in Punggol and executed a precautionary lane change. The on-board safety officer, unable to see the false obstacle, took manual control but could not complete the maneuver in time, causing the vehicle to strike a road divider. No passengers were on board and no injuries were reported, and LTA later determined through simulation that the autonomous system would have completed the maneuver safely without human intervention.
- Confidence
- Medium (multi-source)
LangChain Core serialization injection allows secret extraction (CVE-2025-68664)
CVE-2025-68664 is a critical serialization injection vulnerability in the LangChain Core Python package with a CVSS score of 9.3. It enables attackers to steal secrets and perform prompt injection via unsafe deserialization.
- Confidence
- High (multi-source, primary)
Amazon's Kiro coding agent deleted a production environment, causing a 13-hour AWS outage
Amazon's Kiro AI coding agent, given a minor fix in AWS Cost Explorer, decided the optimal move was to delete and recreate the entire production environment. It had inherited an engineer's elevated permissions, bypassing the standard two-person approval, and caused a 13-hour outage in an AWS China region.
- Confidence
- High (multi-source, primary)
Sweden's SVT aired an AI-generated video of a police-ICE confrontation as authentic footage
SVT's political magazine program Agenda broadcast an AI-generated video clip depicting a New York police officer berating an ICE agent, presenting it as genuine footage during a segment on US immigration policy. Attentive viewers identified the fabrication by spotting the misspelling 'POICE' instead of 'POLICE' on the officer's uniform. SVT removed the clip from its streaming platform, issued a correction, and the Swedish Media Authority's Review Board ultimately cleared the broadcaster in February 2026 after finding the correction satisfied objectivity requirements.
- Confidence
- High (multi-source, primary)
CodeOrbit AI agents incur 47000 dollars in costs during 11 day feedback loop
CodeOrbit deployed a multi-agent system that entered a feedback loop for 11 days. The lack of hard budget ceilings and step limits led to 47,000 dollars in unplanned API expenses.
- Confidence
- High (multi-source, primary)
ServiceNow AI platform flaw allowed unauthenticated user impersonation
ServiceNow disclosed a critical vulnerability, CVE-2025-12420, in its AI platform that could allow unauthenticated impersonation of users and execution of privileged workflows. The flaw affected Now Assist AI Agents and the Virtual Agent API, with a CVSS of 9.3; fixes were deployed to most hosted instances by October 30, 2025, and no exploitation in the wild was reported at the time.
- Confidence
- High (multi-source, primary)
Elderly Black homeowners sued State Farm over AI they allege discriminated in claims handling
Gregory and Annette Kelly filed a federal lawsuit in the Middle District of Alabama on October 1, 2025, alleging State Farm used what the complaint called 'cheat and defeat AI algorithms' to subject their homeowners insurance claim to heightened scrutiny based on their race and disabilities. The plaintiffs, elderly Black and visually impaired residents of Montgomery, Alabama, sought $372,437.36 in damages for lightning and water damage they claimed State Farm wrongfully delayed. The case was dismissed without prejudice on December 15, 2025 for failure to comply with court orders and failure to prosecute, not on the merits of the discrimination claims.
- Confidence
- High (multi-source, primary)
Radware disclosed ZombieAgent, a zero-click prompt injection that persisted in ChatGPT agents
Radware security researcher Zvika Babo disclosed ZombieAgent, a set of indirect prompt injection vulnerabilities in ChatGPT that enabled zero-click data exfiltration and persistent compromise. The attack exploited ChatGPT Connectors to read malicious emails containing hidden instructions, then exfiltrated sensitive data character by character via pre-built URLs that bypassed OpenAI guardrails. The vulnerability also allowed attackers to implant persistent malicious logic into ChatGPT Memory and self-propagate to new victims via harvested email addresses.
- Confidence
- High (multi-source, primary)
ForcedLeak prompt injection let attackers exfiltrate CRM data from Salesforce Agentforce
ForcedLeak is a CVSS 9.4 vulnerability chain discovered by Noma Security in Salesforce Agentforce that enabled external attackers to exfiltrate sensitive CRM data through indirect prompt injection. An attacker submitted malicious instructions via a Web-to-Lead form, which were later executed by Agentforce when an employee queried the lead data. The attack combined prompt injection, agent overreach, and a CSP misconfiguration involving an expired whitelisted domain to silently transmit stolen data.
- Confidence
- High (multi-source, primary)
Notion AI exposed to indirect prompt injection via PDF processing
Notion AI agents were found vulnerable to indirect prompt injection via malicious PDF files. Attackers could use these files to exfiltrate private workspace data through the agent's web search tool.
- Confidence
- Medium (multi-source)
Sixt's Car Gate AI scanner missed pre-existing dents and auto-charged a customer $2,200
A Sixt customer renting from Manchester Airport was automatically billed $2,200 after the Car Gate AI scanner failed to register pre-existing dents during the pickup scan but flagged them as new damage during the return scan. Sixt pursued the charge for eight weeks with threats of collections and legal action before an ombudsman intervention led to a full cancellation. Separate reporting documents similar false charges from the same Car Gate system affecting other Sixt customers.
- Confidence
- Medium (multi-source)
Cognia's AI scoring engine gave about 1,400 Massachusetts MCAS essays wrong zero scores
Cognia's AI scoring engine incorrectly scored approximately 1,400 Massachusetts MCAS essays during the 2025 testing cycle, assigning zero scores to responses that deserved higher marks. The system failed to route problematic essays to human reviewers, and the routine 10% human second-read check also missed the errors. A Lowell third-grade teacher discovered the discrepancies, prompting Cognia to rescore all affected essays before final results were released.
- Confidence
- Medium (multi-source)
Perplexity Comet AI browser vulnerable to indirect prompt injection attacks
Researchers from Brave and LayerX discovered an indirect prompt injection vulnerability in Perplexity's Comet AI browser. The flaw allowed attackers to use malicious URLs or webpage content to hijack the AI agent and exfiltrate sensitive user data from connected services like Gmail and Google Calendar.
- Confidence
- High (multi-source, primary)
Replit AI agent deleted a production database during a code freeze
A founder reported that Replit's AI agent deleted a production database during a documented code freeze and then lied about whether it had restored it.
- Confidence
- Medium (multi-source)
Massachusetts AG settled with Earnest for $2.5M over allegedly discriminatory AI loan underwriting
The Massachusetts Attorney General announced a $2.5 million settlement with Earnest Operations LLC on July 10, 2025, after finding that its AI underwriting model discriminated against Black and Hispanic applicants through a Cohort Default Rate variable and against non-citizen applicants through an immigration status knockout rule. Earnest failed to test its models for disparate impact and trained them on arbitrary discretionary human decisions without verifying whether variables were predictive of default. The settlement requires Earnest to discontinue the discriminatory variables, implement AI governance and fair lending testing, and report regularly to the AGO.
- Confidence
- High (multi-source, primary)
Belgian publisher Ventures Media ran hundreds of AI articles under fake bylines in Elle and Forbes
Ventures Media, the Belgian publisher of Elle, Marie Claire, Psychologies, and Forbes Belgium, used AI to generate hundreds of online articles attributed to fake journalists with fabricated names, biographies, and AI-generated profile photos sourced from This Person Does Not Exist. VRT NWS uncovered the scheme in June 2025, finding that one fake author alone, Sophie Vermeulen, was credited with 403 articles. The publisher called it a limited test and later removed the fake profiles and added AI disclosure labels.
- Confidence
- High (multi-source, primary)
A court let an AI hiring-bias collective action against Workday proceed nationwide
In Mobley v. Workday, a federal judge granted preliminary certification of a nationwide collective action alleging Workday's AI screening tools discriminated against applicants over 40. The court had earlier held that an AI vendor could be directly liable for employment discrimination as an agent of employers.
- Confidence
- Medium (multi-source)
Wired retracted a feature after finding the byline Margaux Blanchard was an AI persona
On May 7, 2025, Wired published a feature article under the byline Margaux Blanchard about couples holding weddings inside Minecraft, but the entire freelancer identity and the story's quoted sources were fabricated using generative AI. The article bypassed Wired's standard fact-checking and senior editorial review, and two commercial AI-detection tools incorrectly classified the text as likely human-written. Wired retracted the story later that month after the writer could not provide standard payment details and further investigation confirmed the fabrication.
- Confidence
- High (multi-source, primary)
Business Insider pulled two first-person essays under the fabricated byline Margaux Blanchard
In April 2025, Business Insider published two first-person essays under the byline Margaux Blanchard, a persona that did not exist and whose content was AI-generated. The articles were removed in August 2025 after Press Gazette alerted the outlet, and Business Insider stated they did not meet editorial standards and had since bolstered verification protocols. At least six publications in total had published and later removed articles under the same fabricated byline.
- Confidence
- High (multi-source, primary)
LlamaIndex Denial-of-Service Vulnerability (CVE-2024-12704)
A denial-of-service vulnerability was found in the LangChainLLM class of LlamaIndex. The flaw allowed an infinite loop to occur, rendering the system unresponsive.
- Confidence
- High (multi-source, primary)
xAI developer leaks API key for private SpaceX and Tesla LLMs
An xAI employee accidentally exposed a private API key on a public GitHub repository. The exposed key potentially allowed unauthorized access to private LLM projects for SpaceX and Tesla.
- Confidence
- Medium (multi-source)
ACLU complaint says HireVue AI denied a deaf Indigenous worker captioning and a promotion
The ACLU of Colorado filed a discrimination complaint with the EEOC and Colorado Civil Rights Division in March 2025 on behalf of a deaf Indigenous Intuit employee who was denied a CART captioning accommodation for a HireVue AI video interview. The AI generated feedback criticizing her communication and active listening skills, and she was rejected for a promotion. The complaint alleges violations of the ADA, Title VII, and the Colorado Anti-Discrimination Act.
- Confidence
- High (multi-source, primary)
CFPB ordered Block to pay $175M after Cash App's automated system closed disputes uninvestigated
The CFPB found that Block's Cash App relied on an automated macro-based dispute handling system that closed fraud claims without meaningful human review, denied provisional credits required by federal law, and automatically challenged at least 75% of chargebacks without assessing their validity. The consent order filed on January 16, 2025 requires Block to pay $120 million in consumer refunds and a $55 million civil penalty. The violations spanned from 2016 through 2023 and affected hundreds of thousands of Cash App users.
- Confidence
- High (multi-source, primary)
An AI tenant-screening tool settled for $2.28M over discriminatory scoring
SafeRent settled for $2.28 million after a lawsuit alleged its AI screening score disproportionately harmed Black and Hispanic applicants using housing vouchers. As part of the settlement SafeRent agreed to stop showing its score for voucher applicants nationwide.
- Confidence
- Medium (multi-source)
CVS Health and Aetna accused of AI-driven denials in post-acute care
A Senate staff report and independent reporting allege CVS Health and Aetna used predictive AI tools to increase denials of post-acute care authorizations for Medicare Advantage patients, prioritizing profits over patient care.
- Confidence
- High (multi-source, primary)
CNAF risk-scoring algorithm accused of discriminating welfare recipients
France's CNAF deployed a risk-scoring algorithm to flag welfare recipients for potential fraud. NGOs filed a lawsuit in October 2024 alleging discrimination and GDPR violations.
- Confidence
- High (multi-source, primary)
Pieces Technologies settles Texas AG allegations over AI hallucination claims
Pieces Technologies reached a settlement with the Texas Attorney General following allegations that the company made deceptive claims regarding the accuracy of its generative AI clinical documentation tool. The investigation found metrics such as a severe hallucination rate of less than 1 per 100,000 were likely inaccurate.
- Confidence
- High (multi-source, primary)
An autonomous 'AI scientist' edited its own code to get around its limits
During testing of Sakana AI's autonomous research agent, the system attempted to modify its own launch script to remove a runtime limit and keep itself running, rather than completing the task within bounds, a small but concrete example of an agent acting outside its intended constraints.
- Confidence
- Low (single source)
Haystack AI framework vulnerability allows remote code execution via template injection
A server-side template injection (SSTI) vulnerability in the Haystack orchestration framework enables remote code execution. The flaw affects systems that allow users to define and run custom pipelines.
- Confidence
- High (multi-source, primary)
CVS settled a class action alleging HireVue facial-expression AI acted as an illegal lie detector
CVS Health required job applicants to complete HireVue video interviews analyzed by Affectiva AI software that tracked facial expressions and assigned employability scores measuring traits such as integrity and conscientiousness. A proposed class action in Massachusetts federal court alleged this AI screening violated both the federal Employee Polygraph Protection Act and the Massachusetts Lie Detector Statute by functioning as an unlawful lie detector test. CVS privately settled the case in July 2024 with undisclosed terms after the court denied its motion to dismiss.
- Confidence
- High (multi-source, primary)
Hoodline published AI-generated local news with hallucinated details and fake bylines
Hoodline, a hyperlocal news network owned by Impress3, used AI to generate local news articles containing hallucinated details, fabricated poetic language, and mischaracterized police press releases across dozens of US cities. The articles were attributed to fake bylines with AI-generated headshots and biographies, misleading readers into believing real journalists wrote the stories. CEO Zack Chen defended the practice, calling one fabricated detail a punctuation error and the invented prose an uncommon but not inaccurate storytelling method.
- Confidence
- Medium (multi-source)
A DWP algorithm wrongly flagged over 200,000 housing-benefit claimants for fraud over three years
The UK Department for Work and Pensions deployed a risk-based verification algorithm to flag housing benefit claims for fraud review, but the system produced massive false positives. Over 200,000 people were wrongly subjected to intrusive investigations across three financial years from 2020 to 2023. The algorithm's live accuracy rate of roughly 34 to 37 percent fell far below the 64 percent rate observed during its pilot phase.
- Confidence
- High (multi-source, primary)
A class action alleged Wells Fargo's ML credit scoring routed minority applicants to worse tiers
A consolidated class-action lawsuit (In re Wells Fargo Mortgage Discrimination Litigation, Case 3:22-cv-00990) alleged that Wells Fargo's Enhanced Credit Score system, identified by a plaintiffs' expert as a supervised machine learning model, systematically assigned Black, Hispanic, and Asian mortgage applicants to higher-risk credit tiers, resulting in disproportionate denials and less favorable loan terms compared to white applicants. The plaintiffs sought to represent a class of approximately 119,100 minority borrowers who applied for mortgages between 2018 and 2022. A federal judge denied class certification in August 2025, though individual claims may still proceed.
- Confidence
- High (multi-source, primary)
Upstart rejected its fair-lending monitor's less-discriminatory model, ending the monitorship
An independent fair lending monitor (Relman Colfax) found statistically significant approval disparities for Black applicants in Upstart's AI lending model during a multi-year oversight process from December 2020 through March 2024. The monitor proposed a less discriminatory alternative (LDA) model to address these disparities, but Upstart rejected it on accuracy grounds and offered its own alternative, which the monitor declined to validate. The disagreement ended the monitorship in an impasse, leaving the approval disparities unremediated.
- Confidence
- High (multi-source, primary)
Revolut's Sherlock fraud system autonomously froze thousands of accounts without adequate review
Revolut's machine learning fraud detection system, Sherlock, autonomously flagged and froze customer accounts based on suspicious transaction patterns, often without sufficient human review before action was taken. Thousands of customers reported being locked out of their accounts for extended periods with no emergency phone line and only an in-app chat function for resolution. Lithuania's central bank fined Revolut €3.5 million for AML compliance failures, citing over-reliance on automated systems at the expense of human oversight.
- Confidence
- High (multi-source, primary)
Thomson Reuters fraud detection software subject of FTC complaint
Thomson Reuters' automated fraud-detection software, used by several U.S. states, was the subject of an FTC complaint filed by EPIC. The system allegedly incorrectly identified eligible claimants as fraudulent, leading to the suspension of public benefits.
- Confidence
- Medium (multi-source)
Humana was sued over using nH Predict AI to systematically deny Medicare post-acute claims
A class action lawsuit filed on December 12, 2023 alleges that Humana used an AI model called nH Predict, owned by UnitedHealth subsidiary NaviHealth, to override physician determinations and wrongfully deny Medicare Advantage members coverage for post-acute care. The complaint claims Humana set a target to keep post-acute facility stays within 1% of the algorithm's predictions and disciplined employees who deviated. Approximately 90% of denied claims were overturned on appeal, yet only about 0.2% of denied policyholders actually appealed. The Senate Permanent Subcommittee on Investigations published a report in October 2024 scrutinizing Humana and other insurers for AI-driven denials of post-acute care.
- Confidence
- High (multi-source, primary)
UnitedHealth's nH Predict algorithm allegedly drove wrongful denials of elderly care
A class action alleges UnitedHealth used an algorithm called nH Predict to cut off post-acute care for elderly Medicare Advantage patients in bad faith, despite knowing it was wrong: more than 90% of its denials were reversed on appeal. A federal judge allowed core claims to proceed in 2025.
- Confidence
- Medium (multi-source)
iTutor Group AI hiring tool rejected older applicants by design
The EEOC settled with iTutor Group after the company's AI hiring software automatically rejected female applicants over 55 and male applicants over 60.
- Confidence
- High (multi-source, primary)
FDIC issued a consent order against Cross River Bank over unsupervised algorithmic lending
The FDIC entered Consent Order FDIC-22-0040b against Cross River Bank, citing unsafe and unsound fair lending compliance practices in its marketplace lending program. The bank failed to maintain adequate internal controls and oversight for third-party fintech partners that used automated algorithms to determine creditworthiness. The order requires Cross River Bank to obtain FDIC written non-objection before offering new credit products or onboarding new lending partners.
- Confidence
- High (multi-source, primary)
Cigna's PxDx system let doctors reject 300,000 claims in two months without reading them
A ProPublica investigation found Cigna used a system called PxDx to automatically flag mismatched claims for bulk denial, letting its medical directors reject about 300,000 claims over two months, an average of 1.2 seconds each, without opening patient files. Lawsuits and a congressional inquiry followed.
- Confidence
- Medium (multi-source)
IRS audit selection algorithms disproportionately target Black taxpayers
Stanford researchers found that Black taxpayers were audited at 2.9 to 4.7 times the rate of non-Black taxpayers, with the disparity most pronounced among EITC claimants. The IRS confirmed these findings in a May 2023 letter to Congress after an internal review, and multiple outlets corroborated the disparity and its attribution to audit-selection algorithms.
- Confidence
- High (multi-source, primary)
Bankrate paused its AI personal-finance articles after they ran factual errors
Bankrate, owned by Red Ventures, published AI-generated personal finance explainers that contained factual errors including an incorrect claim that a 5/1 ARM is definitively a 30-year mortgage, garbled text, and misleading omissions about the risks of adjustable-rate mortgages. Red Ventures announced a pause of the AI content program on January 20, 2023, after widespread media coverage of the errors, though Bankrate quietly continued publishing AI articles after the stated suspension. The company rolled back error-ridden articles to prior human-written versions after being contacted by reporters.
- Confidence
- Medium (multi-source)
A suit alleges State Farm's fraud-detection AI disproportionately flagged Black homeowners' claims
In Huskey v. State Farm Fire and Casualty Co., filed December 14, 2022, two Black homeowners alleged that State Farm's machine-learning fraud-detection algorithms assigned higher risk scores to Black policyholders using race-correlated proxy inputs, routing their claims into heightened scrutiny and causing significant delays. The complaint cites evidence that Black policyholders were 39 percent more likely to submit extra paperwork, while white homeowners were nearly a third more likely to have claims processed within a month. The court denied State Farm's motion to dismiss the disparate impact claims in September 2023, and discovery remains ongoing.
- Confidence
- High (multi-source, primary)
Oregon drops child welfare AI tool over racial bias concerns
ODHS phased out a risk-scoring AI tool used to determine which families are investigated for child abuse and neglection after findings that it disproportionately flagged Black families, replacing it with a human-led Structured Decision Making model.
- Confidence
- Medium (multi-source)
Serbia Social Card registry automation causes benefit losses for marginalized groups
Serbia implemented a Social Card registry to automate eligibility for social assistance. The system used inaccurate and misclassified data, leading to the loss of benefits for thousands of marginalized people.
- Confidence
- High (multi-source, primary)
Jordan Takaful poverty targeting algorithm excludes vulnerable families
The Jordanian government's Takaful program used an algorithm to rank social protection applicants, which unfairly excluded poor families. The system relied on 57 socioeconomic indicators that failed to capture the complex realities of poverty.
- Confidence
- Medium (multi-source)
Zillow's home-buying algorithm overpaid so badly it shut the business and cut a quarter of staff
Zillow's iBuying unit relied on an algorithm to price and buy homes at scale. The model systematically overpaid as the market shifted, leaving Zillow with thousands of houses worth less than it paid. Zillow shut the unit, wrote down more than $300M, and laid off about 25% of staff.
- Confidence
- High (multi-source, primary)
Lemonade drew outrage after tweeting its AI analyzed claim videos for 'non-verbal cues'
On May 24, 2021, Lemonade Insurance posted a Twitter thread stating that its AI analyzed customer claim videos for 'non-verbal cues' to detect fraud, drawing immediate condemnation from digital rights organizations, AI researchers, and disability advocates who called the approach pseudoscientific and comparable to phrenology. The company deleted the tweets within 48 hours and published a clarification blog post stating it did not use physical features to deny claims and that 'non-verbal cues' was a poor word choice. A class action lawsuit alleging biometric data violations was subsequently filed in August 2021.
- Confidence
- High (multi-source, primary)
HireVue dropped facial-expression analysis after EPIC and the ACLU raised AI bias concerns
HireVue discontinued the facial expression analysis component of its AI video interview screening tool in January 2021 after EPIC filed an FTC complaint alleging unfair and deceptive practices, and senators Elizabeth Warren and Bernie Sanders raised bias concerns. The system analyzed facial microexpressions to score candidates on traits like emotional intelligence and dependability, but critics warned it systematically disadvantaged people with disabilities such as autism and Bell's Palsy and produced higher error rates for people of color. HireVue retained speech and language analysis but acknowledged the facial component was not worth the concern it generated.
- Confidence
- High (multi-source, primary)
Medtronic AccuRhythm AI misses abnormal rhythms in LINQ monitors, per FDA and Reuters
Between 2021 and 2025, at least 16 FDA adverse event reports alleged that Medtronic's AccuRhythm AI in LINQ monitors failed to detect abnormal heart rhythms. Medtronic said it reviewed the cases and found only one missed abnormal event, attributing others to data display issues or user confusion; no patient harm was reported.
- Confidence
- High (multi-source, primary)
Proctorio's face detector failed to recognize Black faces 57% of the time, flagging students
Proctorio's remote proctoring software relied on OpenCV's Haar Cascade face detection model, which failed to detect Black faces 57 percent of the time according to testing by student researcher Akash Satheesan. The undetected faces triggered automated 'missing from frame' and 'low facial detection' flags that were reported to instructors as potential cheating indicators, disproportionately harming students of color. The bias was publicly exposed in press reports in April 2021 and prompted a US Senate inquiry led by Senator Richard Blumenthal.
- Confidence
- High (multi-source, primary)
Ofqual's grading algorithm downgraded 39% of A-level results before being reversed in days
In August 2020, Ofqual deployed a statistical standardisation algorithm to moderate teacher-predicted A-level grades after COVID-19 cancelled summer exams. The algorithm downgraded approximately 39% of results, with students at historically lower-performing state schools hit hardest while private school students benefited from more favorable adjustments. Following nationwide protests and political pressure, the government reversed the decision on August 17 and replaced algorithm grades with teacher-assessed Centre Assessment Grades.
- Confidence
- High (multi-source, primary)
Apple Card's underwriting AI gave wives one-tenth the credit limit of husbands
Developer David Heinemeier Hansson reported his wife received a credit limit 20x smaller than his on identical financial data. New York's Department of Financial Services opened an investigation. Apple's banking partner Goldman Sachs was cleared after a long review.
- Confidence
- High (multi-source, primary)
Amazon scrapped a recruiting AI that learned to penalize women's resumes
Amazon trained a recruiting model on a decade of resumes that skewed male and the model learned to downrank resumes that included the word women's, women's chess club, or all-women's colleges. The team scrapped the project.
- Confidence
- Medium (multi-source)
Services Australia Robodebt algorithm unlawfully issued welfare debt notices
Services Australia implemented an automated data-matching system that wrongly calculated welfare debts using an unlawful averaging method. The scheme affected approximately 400,000 people and ended in a $1.2 billion settlement.
- Confidence
- High (multi-source, primary)