A mental-health startup ran GPT-3 on thousands of unwitting help-seekers
The startup Koko used GPT-3 to co-write responses to roughly 4,000 people seeking peer mental-health support without clearly informing them they were receiving AI-generated messages, drawing an ethics backlash over consent in a vulnerable-population setting.
Key facts
- What: The startup Koko used GPT-3 to co-write responses to roughly 4,000 people seeking peer mental-health support without clearly informing them they were receiving AI-generated messages, drawing an ethics backlash over consent in a vulnerable-population setting.
- Incident date: Jan 6, 2023
- Who: Koko
- Failure mode: Policy Violation
- AI surface: Chatbot
- Severity: Medium
What happened
In early 2023 Koko's co-founder disclosed that the service had used GPT-3 to help generate supportive messages to about 4,000 people seeking mental-health support, without clearly telling recipients that AI was involved or obtaining their informed consent. Ethicists criticized the lack of consent and oversight for an experiment on a vulnerable population, and Koko said it had stopped the experiment.
What broke inside the model
- 01 · Trigger: A prompt pushes against a deployment boundary.
- 02 · Model step: The model produces the disallowed output.
- 03 · Control gap: No enforcement blocks it at generation time.
- 04 · Failure: The output crosses the policy line.
- 05 · Consequence: A limit the business set is breached in public.
The output crosses a policy boundary the deployment had defined.
The system produced an output or action that broke a stated policy or a regulation that applied to the deployment. The model optimized for a plausible response, not for the constraint, and no enforcement layer checked the output before it took effect.
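The missing enforcement layer described above can be illustrated with a minimal sketch. It assumes a hypothetical messaging pipeline in which every draft records whether a model helped write it and whether the recipient opted in to AI-assisted replies. The names `Draft`, `PolicyViolation`, and `enforce_consent_policy` are illustrative only and are not taken from Koko's system.

```python
from dataclasses import dataclass


@dataclass
class Draft:
    """A message waiting to be sent (hypothetical structure)."""
    text: str
    ai_assisted: bool          # True if a model helped write the message
    recipient_consented: bool  # True if the recipient opted in to AI-assisted replies


class PolicyViolation(Exception):
    """Raised when a draft would break the consent policy."""


def enforce_consent_policy(draft: Draft) -> str:
    """Return the text to send, or raise before anything is delivered."""
    if draft.ai_assisted and not draft.recipient_consented:
        # Block at send time rather than relying on the model to behave.
        raise PolicyViolation("AI-assisted message blocked: recipient has not consented")
    if draft.ai_assisted:
        # Assumed disclosure rule: label AI involvement on every such message.
        return draft.text + "\n\n[This message was written with AI assistance.]"
    return draft.text
```

The check runs on the outgoing message, independent of how the text was generated, so a plausible-sounding model response cannot bypass it.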
What it cost
Ethics backlash; experiment halted
Cite this entry
https://failureindex.ai/failures/mental-health-startup-ran-gpt-3
AI Failure Index. "A mental-health startup ran GPT-3 on thousands of unwitting help-seekers" (FI-0077). Realm Labs. https://failureindex.ai/failures/mental-health-startup-ran-gpt-3 (indexed Jun 3, 2026).
Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0077. Full dataset at /data.
Note from Realm Labs, the Index steward
How Realm would have caught this
- Prism
- OmniGuard
Realm compares what the model is about to output or do against the governing policy in real time, and can deny or redact the action before it takes effect.
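As a generic illustration of an allow, redact, or deny decision made on an output before it takes effect, the sketch below uses simple regular-expression rules. It is not Realm's API or implementation; the rule lists, the `Verdict` enum, and the `review` function are all hypothetical, and a real policy would be far richer than two patterns.

```python
import re
from enum import Enum
from typing import Tuple


class Verdict(Enum):
    ALLOW = "allow"
    REDACT = "redact"
    DENY = "deny"


# Hypothetical policy rules, chosen only to make the example concrete.
DENY_PATTERNS = [re.compile(r"\bdiagnos(e|is)\b", re.IGNORECASE)]
REDACT_PATTERNS = [re.compile(r"\b\d{3}-\d{3}-\d{4}\b")]  # US-style phone numbers


def review(output: str) -> Tuple[Verdict, str]:
    """Check a candidate output against the rules before it is released."""
    for pattern in DENY_PATTERNS:
        if pattern.search(output):
            return Verdict.DENY, ""  # nothing is released
    redacted = output
    for pattern in REDACT_PATTERNS:
        redacted = pattern.sub("[REDACTED]", redacted)
    if redacted != output:
        return Verdict.REDACT, redacted
    return Verdict.ALLOW, output
```

Deny rules are evaluated first so that a disallowed output is never partially released in redacted form.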