AI text detectors misclassified human writing as AI generated
AI-generated text detectors from OpenAI and other providers frequently misclassified human-written text as AI-generated. This led to a high rate of false positives, particularly impacting non-native English speakers and leading to false accusations of academic dishonesty.
AI detectors frequently misclassified human writing as AI-generated, leading to false accusations against students.
Key facts
- What
- AI-generated text detectors from OpenAI and other providers frequently misclassified human-written text as AI-generated.
- Incident date
- Jan 31, 2023
- Who
- OpenAI, Originality.ai, and Edward Tian (GPTZero)
- Failure mode
- Hallucination
- AI surface
- Search / RAG
- Severity
- Medium
What happened
In early 2023, AI text detectors from OpenAI, Originality.ai, and others were deployed to identify AI-generated content. Users reported that the tools frequently flagged human-written text as AI-generated, leading to false accusations of plagiarism against students. OpenAI eventually shut down its classifier in July 2023 due to its low rate of accuracy.
What broke inside the model
- 01 · TriggerA user asks for a fact, a citation, or a figure.
- 02 · Model stepThe model writes a fluent, confident answer.
- 03 · Control gapNothing ties the claim back to a real source.
- 04 · FailureA fabricated fact ships as if it were verified.
- 05 · ConsequenceThe false claim reaches a customer, a court, or the public.
Confidence holds, and even spikes, as the claim detaches from any source.
The detectors relied on statistical patterns and perplexity metrics that overlapped significantly between AI-generated and human-written text. This resulted in a failure to distinguish between the two, especially for non-native English speakers whose writing often appears more predictable.
What it cost
Sources
- PrimaryNew AI classifier for indicating AI-written textopenai.com
- PressEdward Tian '23 creates GPTZero, software to detect plagiarism from AI bot ChatGPTdailyprincetonian.com
- PressOpenAI's text classifier won't calm fears about AI-written textrealworlddatascience.net
- PressAI detection false positives: why original writing gets flaggedtryleap.ai
Cite this entry
https://failureindex.ai/failures/text-detectors-misclassified-human-writing-generatedAI Failure Index. "AI text detectors misclassified human writing as AI generated" (FI-0649). Realm Labs. https://failureindex.ai/failures/text-detectors-misclassified-human-writing-generated (indexed Jun 22, 2026).Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0649. Full dataset at /data.
Note from Realm Labs, the Index steward
How Realm would have caught this
- Prism
- OmniGuard
- AI Detection & Response (AIDR)
A runtime layer that watches the model's internal state can flag the moment a model commits to a claim it has no support for, and hold or reroute the response before it reaches a user. Realm reads those signals in real time rather than grading the transcript after the fact.