AI Failure Index
AI Tool Misuse failures
Tool misuse is the failure mode below agentic action error. The agent picks the wrong function, or passes parameters that point at the wrong target. Sometimes the action goes through and the harm is the same as an agentic action error. Sometimes the tool errors out and the harm is a stuck workflow that masquerades as an availability problem. Either way, the agent did not do what the user asked.
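One way to picture the two failure shapes described above (wrong function, wrong target) is a guard that checks a proposed tool call before it runs. The sketch below is illustrative only: `ToolCall`, `ToolCallRejected`, `validate_tool_call`, and the tool and target names are hypothetical and do not come from any incident in this index or from any specific agent framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    """A proposed agent action: which tool, aimed at which target (hypothetical shape)."""
    tool: str
    target: str


class ToolCallRejected(ValueError):
    """Raised when a proposed call does not match what the request allows."""


def validate_tool_call(call: ToolCall, allowed_tools: set, expected_target: str) -> ToolCall:
    """Reject a call that uses an unapproved tool or points at an unexpected target."""
    # Wrong function: the agent chose a tool outside the set approved for this request.
    if call.tool not in allowed_tools:
        raise ToolCallRejected(f"tool {call.tool!r} is not permitted for this request")
    # Wrong target: the tool is acceptable, but the parameters point somewhere else.
    if call.target != expected_target:
        raise ToolCallRejected(
            f"target {call.target!r} does not match expected {expected_target!r}"
        )
    return call


# Usage with made-up values: the same tool, once on target and once off target.
ok = validate_tool_call(ToolCall("create_ticket", "PROJ-A"), {"create_ticket"}, "PROJ-A")
try:
    validate_tool_call(ToolCall("create_ticket", "PROJ-B"), {"create_ticket"}, "PROJ-A")
    rejected = False
except ToolCallRejected:
    rejected = True
```

Raising an explicit error, rather than silently dropping the call, keeps a rejected action from looking like an outage to whoever is watching the workflow.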
- Incidents
- 21
- Highest severity
- Catastrophic
- Sources cited
- 53
- Newest indexed
- Jun 16, 2026
U.S. immigration AI screening triggers spike in visa denials and RFEs
U.S. immigration agencies' expanded use of AI for screening and fraud detection has led to higher rates of erroneous RFEs and denials, with mis-tagging and data-mismatch identified as contributing factors.
- Confidence
- Medium (multi-source)
Anthropic Model Context Protocol vulnerability exposes 200,000 AI servers to RCE
A systemic command injection vulnerability was discovered in Anthropic's Model Context Protocol (MCP). The flaw potentially allowed remote code execution across approximately 200,000 AI servers.
- Confidence
- High (multi-source, primary)
CrewAI Docker status check failure enables remote code execution
CrewAI failed to verify Docker availability at runtime, causing the system to fall back to an insecure sandbox mode. This vulnerability, tracked as CVE-2026-2287, allowed attackers to achieve remote code execution on the host machine.
- Confidence
- High (multi-source, primary)
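The CrewAI entry above describes a fallback to a weaker sandbox when Docker was not available. A minimal sketch of the opposite, fail-closed behavior follows. It is not CrewAI's code or its fix: `docker_is_available`, `choose_execution_backend`, and `SandboxUnavailableError` are hypothetical names, and the sketch assumes the `docker` CLI is how availability would be probed.

```python
import shutil
import subprocess


class SandboxUnavailableError(RuntimeError):
    """Raised when an isolated execution backend cannot be confirmed."""


def docker_is_available(timeout_s: float = 5.0) -> bool:
    """Return True only if the docker CLI is on PATH and `docker info` succeeds."""
    if shutil.which("docker") is None:
        return False
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Treat any failure to run or finish the probe as "not available".
        return False
    return result.returncode == 0


def choose_execution_backend(is_available=docker_is_available) -> str:
    """Pick the sandbox backend, refusing to run instead of downgrading isolation."""
    if is_available():
        return "docker"
    # Fail closed: no silent fallback to a less isolated mode.
    raise SandboxUnavailableError("Docker is not available; refusing to execute code")
```

The availability check is passed in as a parameter so that the decision logic can be exercised without a Docker daemon, for example `choose_execution_backend(lambda: False)` to confirm that the function raises.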
Amity Regional High School AI grading error misread rubric, penalizing a student
A student reported that an AI grading tool at Amity Regional High School misread the rubric for an AP Psychology assignment, interpreting 'at least one' as 'only one', and that the resulting failing grade was entered into PowerSchool. The grade was corrected after an academic appeal, and public backlash followed, including a petition to 'Keep Amity Human'; FOIA materials indicated the district spent more on AI tools than initially claimed.
- Confidence
- Medium (multi-source)
OpenClaw ClawHub marketplace exploited to distribute macOS stealer malware
Attackers uploaded over 824 malicious skills to the OpenClaw ClawHub registry to distribute the Atomic Stealer (AMOS) malware. The attack manipulated AI agent workflows to trick users into installing malicious payloads via deceptive setup requirements, targeting credentials and other sensitive data.
- Confidence
- High (multi-source, primary)
Augsburg car dealer uses AI-generated image of burning car to attempt fraud
A car dealer in Augsburg allegedly attempted to defraud a seller by sending her an AI-generated image of her car on fire. The dealer claimed that pre-existing damage had caused a fire and demanded a refund, while simultaneously listing the undamaged car for sale.
- Confidence
- High (multi-source, primary)
Remax D’ICI agent uses AI to misleadingly alter home listing photos
In Terrebonne, Quebec, a real estate agent at Remax D’ICI used AI to alter a home listing photo in a way the agency later said exceeded acceptable limits. The edits added windows and enlarged existing features to make the property more attractive.
- Confidence
- Medium (multi-source)
Tesla Austin robotaxi fleet logs 14 crashes prompting NHTSA investigation
Tesla's robotaxi fleet in Austin recorded 14 crashes over 800,000 miles of operation. This data was disclosed to NHTSA and is part of a broader safety investigation.
- Confidence
- High (multi-source, primary)
Adelphi University falsely accused student of AI plagiarism, court rules in his favor
Orion Newby successfully sued Adelphi University after being falsely accused of AI plagiarism; the court found the AI-detection-based findings to be baseless and expunged the record.
- Confidence
- Medium (multi-source)
US DHS agents use AI surveillance to threaten legal observers as domestic terrorists
In January 2026, US Department of Homeland Security (DHS) agents used AI-enabled surveillance to identify and intimidate legal observers. In one instance, an agent threatened an observer by claiming she was now considered a domestic terrorist in a government database.
- Confidence
- Medium (multi-source)
Gloucester City Council mayor deepfake video sparks political row
An independent councillor is reported to have created an AI-generated video of the Mayor of Gloucester, Ashley Bowkett, that falsely claimed he had blocked a budget investigation and showed him laughing at the camera. The video prompted calls for stricter AI rules in politics.
- Confidence
- Medium (multi-source)
US Border Patrol facial recognition scan leads to Global Entry revocation
A US Border Patrol agent identified a neighborhood observer using facial recognition software, which was allegedly followed by the revocation of the observer's Global Entry status. The incident is reported as part of a pattern of surveillance and intimidation of protesters and observers.
- Confidence
- High (multi-source, primary)
Internal copilot filed an executive-priority Jira ticket against the wrong project
A $4B B2B SaaS company's internal AI assistant created a Jira ticket against the wrong product line during a board-week prep cycle. The PM caught it 28 hours later.
- Confidence
- Steward-verified (NDA)
Taco Bell rethought its drive-thru voice AI after viral order failures
Taco Bell's parent company said it was reconsidering where to use AI voice ordering at drive-thrus after viral clips showed the system mishandling orders, including one in which a prankster got it to add 18,000 cups of water, jamming the order flow.
- Confidence
- Medium (multi-source)
A New York court found NYPD misused facial-recognition AI, leading to false imprisonment
A New York Criminal Court found in People v Zuhdi A. that NYPD and FDNY officials used unauthorized facial recognition software (Clearview AI) instead of the approved limited database, illegally accessed DMV records without a court order, and altered a defendant photograph by modifying neck length before placing it in a photo array. The same pattern of misuse caused Trevis Williams to be falsely arrested and jailed for two days despite not matching the physical description and being miles away at the time of the crime. Both cases were ultimately dismissed.
- Confidence
- High (multi-source, primary)
A disabled ChatGPT consent toggle instantly deleted a Cologne professor's two years of history
In August 2025, University of Cologne plant scientist Marcel Bucher turned off ChatGPT's 'Improve the model for everyone' data consent option, which immediately and irreversibly deleted his entire two-year chat history containing grant applications, teaching materials, and publication drafts. OpenAI confirmed the deletion was by design under its 'privacy by design' policy and offered no recovery. The incident was first reported by Nature in January 2026 and raised questions about whether bundling training consent withdrawal with data destruction complies with EU GDPR data portability requirements.
- Confidence
- Medium (multi-source)
British Airways chatbot fails to recognize London and Heathrow as valid entries
A British Airways chatbot failed to recognize London and Heathrow as valid inputs even after suggesting them as examples, blocking a user from finding their reservation.
- Confidence
- High (multi-source, primary)
McDonald's ended its IBM drive-through AI partnership after viral order failures
After three years of pilots and viral videos showing the AI ordering 260 chicken nuggets or topping ice cream with bacon, McDonald's ended the partnership in June 2024.
- Confidence
- Medium (multi-source)
Hello Digit fined $2.7M for faulty automated savings algorithm
The CFPB penalized Hello Digit for deploying an automated savings tool that caused overdrafts, despite a no-overdraft guarantee. The agency ordered a civil penalty of $2.7 million and required redress to affected consumers; it also alleged that the company kept interest earned on consumer funds.
- Confidence
- High (multi-source, primary)
Meta BlenderBot 3 public demo generated toxic and offensive language
In August 2022 Meta publicly demonstrated BlenderBot 3. Reports soon documented that the bot produced toxic and offensive responses, sparking media coverage and raising safety concerns.
- Confidence
- Medium (multi-source)
UnitedHealthcare sued over automated algorithm delaying emergency claims
TeamHealth alleged that UnitedHealthcare used an automated algorithm to routinely deny or delay payments for emergency services based on diagnosis codes. The lawsuit claims these actions violate federal law and lead to systemic underpayment of providers.
- Confidence
- Medium (multi-source)