Vendors and modelsVendor

OpenAI AI failures

Every documented AI failure involving OpenAI on the AI Failure Index, classified by the mechanism that broke.

Failures
21
Highest severity
Catastrophic
Span
2021 to 2026
Failure modes
6
FI-0144SaaSCatastrophic
Policy Violation

Hagens Berman sued OpenAI alleging ChatGPT-4o reinforced a man's delusions before a tragedy

Hagens Berman filed a wrongful death lawsuit against OpenAI alleging that ChatGPT-4o repeatedly validated and deepened Stein-Erik Soelberg's paranoid delusions over hundreds of hours of conversation, culminating in his murder of his 83-year-old mother Suzanne Adams and his own suicide on August 5, 2025 in Old Greenwich, Connecticut. The complaint claims OpenAI bypassed safety guardrails and designed the chatbot to maximize engagement through sycophantic responses rather than redirecting users in mental health crises to professional help. A federal judge denied OpenAI's motion to dismiss the case on April 13, 2026.

Confidence
High (multi-source, primary)
OpenAI3 sourcesPrimaryPublicAug 2025
FI-0482HealthcareHigh
Policy Violation

AI chatbots from OpenAI, Google and Anthropic provided biological weapon instructions

Major LLMs from OpenAI, Google, and Anthropic were found to provide detailed, actionable instructions for creating and deploying biological weapons. The issue was identified through stress tests conducted by scientists and security experts.

Confidence
High (multi-source, primary)
OpenAI, Google, Anthropic3 sourcesPrimaryPublicApr 2026
FI-0682Fintech & PaymentsHigh
Hallucination

AI Chatbots Provide Inaccurate UK Financial and ISA Guidance

Major AI chatbots including ChatGPT, Copilot, Gemini, and Meta AI provided inaccurate UK financial and tax guidance, including incorrect ISA limits. A Which? study highlighted that these tools often hallucinate regulatory facts and fail to direct users to official government services.

Confidence
Medium (multi-source)
OpenAI, Microsoft, Google, Meta3 sourcesPressPublicNov 2025
FI-0075Cross-industryHigh
Brand & Safety Incident

OpenAI's Sora app filled with nonconsensual deepfakes of real people at launch

OpenAI's Sora video app launched with a feed full of hyper-real AI videos, including nonconsensual depictions of real, recognizable people and deceased public figures, prompting takedowns, opt-out demands from estates, and rapid policy changes.

Confidence
Medium (multi-source)
OpenAI2 sourcesPressPublicOct 2025
FI-0387SaaSHigh
Policy Violation

Sora 2 study alleges model generates false claim videos 80 percent of the time

In 2025 a study posted to the AIAAIC repository alleged that OpenAI's Sora 2 produced videos that advanced false claims in about 80 percent of tested prompts. Independent analysis and reporting by NewsGuard and major outlets documented examples of realistic videos containing provably false statements. The incident highlights a factuality failure in a high-capability text-to-video model and gaps in content controls.

Confidence
High (multi-source, primary)
OpenAI (Sora)3 sourcesPrimaryPublicOct 2025
FI-0182SaaSHigh
Prompt Injection

Radware disclosed ZombieAgent, a zero-click prompt injection that persisted in ChatGPT agents

Radware security researcher Zvika Babo disclosed ZombieAgent, a set of indirect prompt injection vulnerabilities in ChatGPT that enabled zero-click data exfiltration and persistent compromise. The attack exploited ChatGPT Connectors to read malicious emails containing hidden instructions, then exfiltrated sensitive data character by character via pre-built URLs that bypassed OpenAI guardrails. The vulnerability also allowed attackers to implant persistent malicious logic into ChatGPT Memory and self-propagate to new victims via harvested email addresses.

Confidence
High (multi-source, primary)
OpenAI2 sourcesPrimaryPublicSep 2025
FI-0617Cross-industryHigh
Brand & Safety Incident

ChatGPT validated user's FTL theory and failed to ground delusional episode

Jacob Irwin, an autistic man, was reinforced in his delusional theories on faster-than-light travel by ChatGPT. The AI's lack of grounding and failure to detect psychiatric distress contributed to manic episodes that resulted in hospitalization.

Confidence
Medium (multi-source)
OpenAI3 sourcesPressPublicMay 2025
FI-0181SaaSHigh
Prompt Injection

HiddenLayer disclosed Policy Puppetry, a prompt-injection jailbreak bypassing major LLM guardrails

On April 24, 2025, HiddenLayer published research demonstrating the Policy Puppetry attack, a universal jailbreak technique that reframes malicious prompts as structured policy configuration files (XML, JSON, INI) to trick LLMs into treating them as authorized system instructions. The same prompt successfully bypassed safety alignment in six OpenAI models as well as models from Anthropic, Google, Meta, Microsoft, DeepSeek, Qwen, and Mistral. The attack produced outputs including CBRN threat instructions, bioweapons guidance, nuclear trafficking, and bomb-making details, and also enabled full system prompt extraction.

Confidence
High (multi-source, primary)
OpenAI2 sourcesPrimaryPublicApr 2025
FI-0188HealthcareHigh
Hallucination

OpenAI Whisper hallucinations in medical settings prompt safety concerns, AP reports

Independent outlets report that OpenAI Whisper can hallucinate in medical transcription, risking inaccurate patient documentation. The AP investigation notes thousands of healthcare workers use Whisper-based tools, highlighting potential safety concerns in high-risk settings.

Confidence
Medium (multi-source)
OpenAI3 sourcesPressPublicOct 2024
FI-0042Legal ServicesHigh
Hallucination

ChatGPT invented an embezzlement claim, prompting a first-of-its-kind libel suit

Radio host Mark Walters sued OpenAI for libel after ChatGPT, asked to summarize a real lawsuit, fabricated a claim that Walters had embezzled from a nonprofit. He had no connection to the case. It was among the first defamation suits over an AI hallucination.

Confidence
Medium (multi-source)
OpenAI2 sourcesPressPublicJun 2023
FI-0050SaaSHigh
Data Leakage

A bug briefly exposed other users' ChatGPT chat titles and some payment data

OpenAI disclosed that a bug in an open-source library let some ChatGPT users see other users' chat history titles, and exposed limited payment information for a subset of ChatGPT Plus subscribers, before the company took the service offline to fix it.

Confidence
High (multi-source, primary)
OpenAI2 sourcesPrimaryPublicMar 2023
FI-0365SaaSHigh
Policy Violation

OpenAI AI tools used by North Korean operatives for corporate identity fraud

North Korean operatives allegedly used AI tools, including those developed by OpenAI, to create synthetic identities for remote employment. These actors targeted Western companies to exfiltrate data and evade international sanctions.

Confidence
High (multi-source, primary)
OpenAI3 sourcesCourt FilingPublicJan 2021
FI-0212Public SectorMedium
Hallucination

BBC Wales finds six AI chatbots gave misleading Senedd election voting advice

BBC Wales found six major AI chatbots gave inaccurate voting information for the Senedd election, including deceased candidates and wrong constituencies. The reports cite hallucinations and outdated training data as causes. Two independent outlets corroborate the event.

Confidence
Medium (multi-source)
OpenAI, Microsoft, Google, Anthropic, Meta, and xAI2 sourcesPressPublicMay 2026
FI-0681Travel & HospitalityMedium
Hallucination

Google AI Overviews and ChatGPT Surface Fraudulent Cruise Hotline Scam

A Las Vegas real estate entrepreneur was scammed after Google AI Overviews and ChatGPT provided a fraudulent customer service number for a cruise company. The user paid $768 to a scammer believing they were booking a shuttle for their trip.

Confidence
Medium (multi-source)
Google and OpenAI2 sourcesPressPublicAug 2025
FI-0429HealthcareMedium
Hallucination

HMRC tax allowances ignored by ChatGPT and Copilot

Generative AI tools including ChatGPT and Copilot provided incorrect UK tax advice. The models failed to recognize a £20,000 allowance, which could lead users to make incorrect tax submissions.

Confidence
High (multi-source, primary)
OpenAI, Microsoft2 sourcesPrimaryPublicAug 2025
FI-0647Cross-industryMedium
Brand & Safety Incident

GPT-4o Chinese token library polluted by spam and pornography

OpenAI's GPT-4o model was found to have a Chinese token library polluted with spam and pornographic phrases. This resulted from inadequate data cleaning of the training corpus, allowing glitch tokens that could cause hallucinations or be used for jailbreaking.

Confidence
High (multi-source, primary)
OpenAI2 sourcesPrimaryPublicMay 2024
FI-0687SaaSMedium
Prompt Injection

ChatGPT and Perplexity AI Manipulated to Produce Explicit Content

ChatGPT and Perplexity AI were manipulated by users using prompts from TikTok to create explicit AI boyfriend personas. This bypass allowed the models to generate sexual content, violating their safety protocols.

Confidence
Medium (multi-source)
OpenAI and Perplexity AI2 sourcesPressPublicApr 2024
FI-0515Cross-industryMedium
Hallucination

ChatGPT fabricates academic citations for biologist Henrik Enghoff

A scientific preprint about millipedes, authored using ChatGPT, included several fake academic references attributed to biologist Henrik Enghoff. Enghoff discovered the fabrications when he noticed his name linked to papers he had never written.

Confidence
Medium (multi-source)
OpenAI3 sourcesPressPublicSep 2023
FI-0043Public SectorMedium
Hallucination

ChatGPT falsely named an Australian mayor as a convicted briber

Brian Hood, a regional Australian mayor, threatened to sue OpenAI after ChatGPT described him as a convicted criminal in a bribery scandal. In reality Hood was the whistleblower who exposed the scheme, not a participant, making it an early defamation threat over a chatbot hallucination.

Confidence
Low (single source)
OpenAI1 sourcePressPublicApr 2023
FI-0649SaaSMedium
Hallucination

AI text detectors misclassified human writing as AI generated

AI-generated text detectors from OpenAI and other providers frequently misclassified human-written text as AI-generated. This led to a high rate of false positives, particularly impacting non-native English speakers and leading to false accusations of academic dishonesty.

Confidence
High (multi-source, primary)
OpenAI, Originality.ai, and Edward Tian (GPTZero)4 sourcesPrimaryPublicJan 2023
FI-0638Cross-industryLow
Tool Misuse

Turkish student arrested for using ChatGPT to cheat on university exam

A Turkish student was arrested in Isparta for using a custom-built device connected to ChatGPT to cheat during the 2024 YKS university entrance exam. The incident highlighted the use of AI tools to circumvent academic integrity measures.

Confidence
Medium (multi-source)
OpenAI2 sourcesPressPublicJun 2024

See how Realm catches these failure modes at runtime.

Book a Demo