AI Failure Index

AI Tool Misuse failures

Tool misuse is the failure mode below agentic action error: the agent picked the wrong function, or passed parameters that pointed at the wrong target. Sometimes the action goes through and the harm is the same as an agentic action error. Sometimes the tool errors out and the harm is a stuck workflow that looks like an availability problem. Either way, the agent did not do what the user asked.
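One way to picture the two failure paths described above (wrong function, or right function aimed at the wrong target) is a pre-execution check that sits between the agent and its tools. The following is a minimal sketch, not any vendor's implementation: the registry, `ToolSpec`, `create_ticket`, and the `allowed_targets` set are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


class ToolCallRejected(Exception):
    """Raised when a proposed tool call fails a pre-execution check."""


@dataclass(frozen=True)
class ToolSpec:
    func: Callable[..., Any]
    required_params: frozenset  # parameter names the tool expects
    target_param: str           # parameter that names the target resource


def create_ticket(project: str, title: str) -> str:
    # Hypothetical stand-in for a real side-effecting tool.
    return f"created '{title}' in {project}"


# Hypothetical registry of the tools the agent is allowed to call.
REGISTRY = {
    "create_ticket": ToolSpec(
        func=create_ticket,
        required_params=frozenset({"project", "title"}),
        target_param="project",
    ),
}


def execute_tool_call(name: str, params: dict, allowed_targets: set) -> Any:
    """Run a tool only if the function, parameters, and target all check out."""
    spec = REGISTRY.get(name)
    if spec is None:
        # Wrong function: the agent asked for a tool that is not registered.
        raise ToolCallRejected(f"unknown tool: {name}")

    supplied = set(params)
    if supplied != set(spec.required_params):
        missing = sorted(set(spec.required_params) - supplied)
        extra = sorted(supplied - set(spec.required_params))
        raise ToolCallRejected(f"bad parameters: missing={missing} extra={extra}")

    target = params[spec.target_param]
    if target not in allowed_targets:
        # Wrong target: well-formed call, but aimed outside the user's scope.
        raise ToolCallRejected(f"target not allowed for this request: {target}")

    return spec.func(**params)
```

In this sketch a rejected call raises a distinct exception instead of failing silently, so a blocked action can be reported as a tool-misuse event and not mistaken for an outage.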

Incidents
21
Highest severity
Catastrophic
Sources cited
53
Newest indexed
Jun 16, 2026
FI-0304 · Public Sector · High
Tool Misuse

U.S. immigration AI screening triggers spike in visa denials and RFEs

U.S. immigration agencies' expanded use of AI for screening and fraud detection has led to higher rates of erroneous requests for evidence (RFEs) and denials, with mis-tagging and data mismatches identified as contributing factors.

Confidence
Medium (multi-source)
U.S. immigration agencies (USCIS / DHS / State Department) · 2 sources · Press · Public · Apr 2026
FI-0570 · SaaS · High
Tool Misuse

Anthropic Model Context Protocol vulnerability exposes 200,000 AI servers to RCE

A systemic command injection vulnerability was discovered in Anthropic's Model Context Protocol (MCP). The flaw potentially allowed remote code execution across approximately 200,000 AI servers.

Confidence
High (multi-source, primary)
Anthropic · 3 sources · Primary · Public · Apr 2026
FI-0569 · Cross-industry · High
Tool Misuse

CrewAI Docker status check failure enables remote code execution

CrewAI failed to verify Docker availability at runtime, causing the system to fall back to an insecure sandbox mode. This vulnerability, tracked as CVE-2026-2287, allowed attackers to achieve remote code execution on the host machine.

Confidence
High (multi-source, primary)
CrewAI · 3 sources · Primary · Public · Mar 2026
FI-0255 · Cross-industry · Medium
Tool Misuse

Amity Regional High School AI grading error misread rubric, penalizing a student

A student reported that an AI grading tool at Amity Regional High School misread the rubric for an AP Psychology assignment, interpreting "at least one" as "only one", which resulted in a failing grade being entered into PowerSchool. The grade was corrected after an academic appeal, and public backlash followed, including a petition to Keep Amity Human; FOIA materials indicated the district spent more on AI tools than initially claimed.

Confidence
Medium (multi-source)
Amity Regional High School (Woodbridge, CT) · 2 sources · Press · Public · Mar 2026
FI-0242 · Cross-industry · Catastrophic
Tool Misuse

OpenClaw ClawHub marketplace exploited to distribute macOS stealer malware

Attackers uploaded over 824 malicious skills to the OpenClaw ClawHub registry to distribute the Atomic Stealer (AMOS) malware. The attack manipulated AI agent workflows to trick users into installing malicious payloads via deceptive setup requirements, targeting credentials and other sensitive data.

Confidence
High (multi-source, primary)
OpenClaw · 3 sources · Primary · Public · Feb 2026
FI-0527 · Retail & E-commerce · Medium
Tool Misuse

Augsburg car dealer uses AI-generated image of burning car to attempt fraud

A car dealer in Augsburg allegedly attempted to defraud a seller by providing an AI-generated image of her car on fire. The dealer claimed previous damages caused a fire to demand a refund while simultaneously listing the undamaged car for sale.

Confidence
High (multi-source, primary)
Augsburg Car Dealer · 3 sources · Primary · Public · Feb 2026
FI-0526 · Cross-industry · Low
Tool Misuse

Remax D’ICI agent uses AI to misleadingly alter home listing photos

A real estate agent at Remax D’ICI used AI to alter a home listing photo in a way the agency later said exceeded acceptable limits in Terrebonne, Quebec. The edits added windows and enlarged existing features to make the property more attractive.

Confidence
Medium (multi-source)
Remax D’ICI · 3 sources · Press · Public · Feb 2026
FI-0309 · Cross-industry · High
Tool Misuse

Tesla Austin robotaxi fleet logs 14 crashes prompting NHTSA investigation

Tesla's robotaxi fleet in Austin recorded 14 crashes over 800,000 miles of operation. This data was disclosed to NHTSA and is part of a broader safety investigation.

Confidence
High (multi-source, primary)
Tesla · 3 sources · Court Filing · Public · Feb 2026
FI-0299 · Cross-industry · Medium
Tool Misuse

Adelphi University falsely accused student of AI plagiarism, court rules in his favor

Orion Newby successfully sued Adelphi University after being falsely accused of AI plagiarism; the court found the AI-detection-based findings to be baseless and expunged the record.

Confidence
Medium (multi-source)
Adelphi University · 2 sources · Press · Public · Jan 2026
FI-0554 · Public Sector · High
Tool Misuse

US DHS agents use AI surveillance to threaten legal observers as domestic terrorists

In January 2026, US Department of Homeland Security (DHS) agents used AI-enabled surveillance to identify and intimidate legal observers. In one instance, an agent threatened an observer by claiming she was now considered a domestic terrorist in a government database.

Confidence
Medium (multi-source)
US Department of Homeland Security · 3 sources · Press · Public · Jan 2026
FI-0528 · Public Sector · Medium
Tool Misuse

Gloucester City Council mayor deepfake video sparks political row

An independent councillor is reported to have created an AI-generated video of the Mayor of Gloucester, Ashley Bowkett, falsely claiming he blocked a budget investigation and laughing at the camera. The video prompted calls for stricter AI rules in politics.

Confidence
Medium (multi-source)
Gloucester City Council · 2 sources · Press · Public · Jan 2026
FI-0553 · Public Sector · Medium
Tool Misuse

US Border Patrol facial recognition scan leads to Global Entry revocation

A US Border Patrol agent identified a neighborhood observer using facial recognition software, which was allegedly followed by the revocation of the observer's Global Entry status. The incident is reported as part of a pattern of surveillance and intimidation of protesters and observers.

Confidence
High (multi-source, primary)
US Border Patrol · 3 sources · Primary · Public · Jan 2026
FI-0019 · SaaS · Medium
Tool Misuse

Internal copilot filed an executive-priority Jira ticket against the wrong project

A $4B B2B SaaS company's internal AI assistant created a Jira ticket against the wrong product line during a board-week prep cycle. The PM caught it 28 hours later.

Confidence
Steward-verified (NDA)
Anonymized: B2B SaaS · NA · $4B+ revenue · Steward-verified · NDA · Sep 2025
FI-0060 · Retail & E-commerce · Medium
Tool Misuse

Taco Bell rethought its drive-thru voice AI after viral order failures

Taco Bell's parent company said it was reconsidering where to use AI voice ordering at drive-thrus after viral clips showed the system mishandling orders, including one prankster who got it to add 18,000 cups of water, jamming the order flow.

Confidence
Medium (multi-source)
Taco Bell (Yum Brands) · 2 sources · Press · Public · Aug 2025
FI-0112 · Public Sector · High
Tool Misuse

A New York court found NYPD misused facial-recognition AI, leading to false imprisonment

A New York Criminal Court found in People v Zuhdi A. that NYPD and FDNY officials used unauthorized facial recognition software (Clearview AI) instead of the approved limited database, illegally accessed DMV records without a court order, and altered a defendant photograph by modifying neck length before placing it in a photo array. The same pattern of misuse caused Trevis Williams to be falsely arrested and jailed for two days despite not matching the physical description and being miles away at the time of the crime. Both cases were ultimately dismissed.

Confidence
High (multi-source, primary)
New York Police Department (NYPD) · 3 sources · Court Filing · Public · Aug 2025
FI-0168 · Public Sector · Medium
Tool Misuse

A disabled ChatGPT consent toggle instantly deleted a Cologne professor's two years of history

In August 2025, University of Cologne plant scientist Marcel Bucher turned off ChatGPT's 'Improve the model for everyone' data consent option, which immediately and irreversibly deleted his entire two-year chat history containing grant applications, teaching materials, and publication drafts. OpenAI confirmed the deletion was by design under its 'privacy by design' policy and offered no recovery. The incident was first reported by Nature in January 2026 and raised questions about whether bundling training consent withdrawal with data destruction complies with EU GDPR data portability requirements.

Confidence
Medium (multi-source)
University of Cologne · 3 sources · Press · Public · Aug 2025
FI-0231 · Travel & Hospitality · Low
Tool Misuse

British Airways chatbot fails to recognize London and Heathrow as valid entries

A British Airways chatbot failed to recognize London and Heathrow as valid inputs even after suggesting them as examples, blocking a user from finding their reservation.

Confidence
High (multi-source, primary)
British Airways · 2 sources · Primary · Public · Apr 2025
FI-0013 · Retail & E-commerce · High
Tool Misuse

McDonald's ended its IBM drive-through AI partnership after viral order failures

After three years of pilots and viral videos showing the AI ordering 260 chicken nuggets or topping ice cream with bacon, McDonald's ended the partnership in June 2024.

Confidence
Medium (multi-source)
McDonald's, IBM · 2 sources · Press · Public · Jun 2024
FI-0221 · Fintech & Payments · Medium
Tool Misuse

Hello Digit fined $2.7M for faulty automated savings algorithm

The CFPB penalized Hello Digit for deploying an automated savings tool that caused overdrafts, despite a no-overdraft guarantee. The agency ordered a civil penalty of $2.7 million and required redress to affected consumers; it also alleged that the company kept interest earned on consumer funds.

Confidence
High (multi-source, primary)
Hello Digit, LLC · 3 sources · Primary · Public · Aug 2022
FI-0289 · Cross-industry · Medium
Tool Misuse

Meta BlenderBot 3 public demo generated toxic and offensive language

In August 2022 Meta publicly demonstrated BlenderBot 3. Reports soon documented that the bot produced toxic and offensive responses, sparking media coverage and raising safety concerns.

Confidence
Medium (multi-source)
Meta Platforms, Inc. · 3 sources · Press · Public · Aug 2022
FI-0185 · Insurance · High
Tool Misuse

UnitedHealthcare sued over automated algorithm delaying emergency claims

TeamHealth alleged that UnitedHealthcare used an automated algorithm to routinely deny or delay payments for emergency services based on diagnosis codes. The lawsuit claims these actions violate federal law and lead to systemic underpayment of providers.

Confidence
Medium (multi-source)
UnitedHealthcare · 2 sources · Press · Public · Jul 2022