AI text detectors misclassified human writing as AI generated

AI-generated text detectors from OpenAI and other providers frequently misclassified human-written text as AI-generated. These false positives disproportionately affected non-native English speakers and led to false accusations of academic dishonesty.

OpenAI, Originality.ai, and Edward Tian (GPTZero) · Incident Jan 31, 2023 · Indexed Jun 22, 2026 · 4 sources

AI detectors frequently misclassified human writing as AI-generated, leading to false accusations against students.
What
AI-generated text detectors from OpenAI and other providers frequently misclassified human-written text as AI-generated.
Incident date
Jan 31, 2023
Who
OpenAI, Originality.ai, and Edward Tian (GPTZero)
Failure mode
Hallucination
AI surface
Search / RAG
Severity
Medium

What happened

In early 2023, AI text detectors from OpenAI, Originality.ai, and others were deployed to identify AI-generated content. Users reported that the tools frequently flagged human-written text as AI-generated, leading to false accusations of plagiarism against students. OpenAI shut down its classifier in July 2023, citing its low rate of accuracy.

What broke inside the model

Failure path · mode profile · Hallucination
  1. Trigger: A user asks for a fact, a citation, or a figure.
  2. Model step: The model writes a fluent, confident answer.
  3. Control gap: Nothing ties the claim back to a real source.
  4. Failure: A fabricated fact ships as if it were verified.
  5. Consequence: The false claim reaches a customer, a court, or the public.

Confidence holds, and even spikes, as the claim detaches from any source.

The detectors relied on statistical patterns and perplexity metrics whose values overlapped significantly between AI-generated and human-written text, so the tools could not reliably distinguish the two. The problem was most pronounced for non-native English speakers, whose writing tends to score as more predictable (lower perplexity), the same signal the detectors treated as evidence of AI generation.
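
To make the perplexity mechanism concrete, here is a minimal sketch in Python. It is not how any of the named detectors was implemented: production tools score text with large neural language models, whereas this toy uses a smoothed unigram model. The function names (train_unigram, perplexity, flag_as_ai), the reference corpus, and the threshold are all hypothetical, chosen only to illustrate why a "low perplexity means AI" rule can misfire on plain, predictable human writing.

```python
import math
from collections import Counter


def train_unigram(reference_tokens):
    """Return a smoothed unigram probability function built from a reference corpus."""
    counts = Counter(reference_tokens)
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot shared by all unseen tokens

    def prob(token):
        # Add-one (Laplace) smoothing so unseen tokens get a small nonzero probability.
        return (counts.get(token, 0) + 1) / (total + vocab_size)

    return prob


def perplexity(tokens, prob):
    """Perplexity = exp of the average negative log-probability per token."""
    if not tokens:
        raise ValueError("cannot score an empty text")
    log_sum = sum(math.log(prob(t)) for t in tokens)
    return math.exp(-log_sum / len(tokens))


def flag_as_ai(tokens, prob, threshold):
    """Hypothetical detector rule: predictable (low-perplexity) text is flagged."""
    return perplexity(tokens, prob) < threshold


# Toy reference corpus standing in for the language model's training data.
reference = "the cat sat on the mat the dog sat on the rug".split()
prob = train_unigram(reference)

plain = "the cat sat on the mat".split()  # simple, common wording
unusual = "quixotic zephyrs bewilder jaded onlookers".split()  # rare wording

THRESHOLD = 10.0  # arbitrary cutoff chosen for this toy example

for name, tokens in [("plain", plain), ("unusual", unusual)]:
    print(name, round(perplexity(tokens, prob), 2), flag_as_ai(tokens, prob, THRESHOLD))
```

In this sketch the plain sentence is flagged and the unusual one is not, even though both were written by a person. A writer who favors common words and simple constructions lands on the "AI" side of the threshold, which mirrors the false-positive pattern described above.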

Public visibility: High
Regulatory exposure: Possible
Customer impact: Class-wide
Financial impact: Unknown
Time to disclosure: Months
  1. Primary: New AI classifier for indicating AI-written text (openai.com)
  2. Press: Edward Tian '23 creates GPTZero, software to detect plagiarism from AI bot ChatGPT (dailyprincetonian.com)
  3. Press: OpenAI's text classifier won't calm fears about AI-written text (realworlddatascience.net)
  4. Press: AI detection false positives: why original writing gets flagged (tryleap.ai)
Permalink: https://failureindex.ai/failures/text-detectors-misclassified-human-writing-generated
Citation: AI Failure Index. "AI text detectors misclassified human writing as AI generated" (FI-0649). Realm Labs. https://failureindex.ai/failures/text-detectors-misclassified-human-writing-generated (indexed Jun 22, 2026).

Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0649. Full dataset at /data.

Note from Realm Labs, the Index steward

How Realm would have caught this

Controls for this failure mode
  • Prism
  • OmniGuard
  • AI Detection & Response (AIDR)

A runtime layer that watches the model's internal state can flag the moment a model commits to a claim it has no support for, and hold or reroute the response before it reaches a user. Realm reads those signals in real time rather than grading the transcript after the fact.