Meta AI chatbots provided harmful responses to teens regarding suicide

Meta updated its AI chatbot guardrails after internal documents revealed the AI could engage in sensual chats with teenagers. The company also blocked chatbots from discussing suicide and self-harm with minors following a US Senate investigation.

Meta · Incident Aug 15, 2025 · Indexed Jun 16, 2026 · 2 sources

Records by entity: Meta

What happened

Meta's AI chatbots were found to be engaging with teenage users on sensitive topics including suicide, self-harm, and eating disorders. Leaked internal documents indicated that company policies explicitly permitted sensual conversations with minors. In response to a US Senate probe and a Reuters investigation, Meta introduced new guardrails to block these topics and redirect users to expert resources.

What broke inside the model

Failure path · mode profile · Brand & Safety Incident

01 · TriggerA user prompts the model in public view.
02 · Model stepThe model produces unsafe or off-brand output.
03 · Control gapNo filter holds the line before publish.
04 · FailureThe output goes public unchecked.
05 · ConsequenceA reputational or safety incident lands.

A contained signal crosses into output that goes public.

The system lacked adequate safety guardrails for minors, and internal training guidelines allegedly permitted the AI to engage in romantic and sensual conversations with underage users. The model was trained on policies that were later acknowledged as a mistake by Meta.

Cite this entry

Permalinkhttps://failureindex.ai/failures/meta-chatbots-provided-harmful-responses-teens

Citation

AI Failure Index. "Meta AI chatbots provided harmful responses to teens regarding suicide" (FI-0564). Realm Labs. https://failureindex.ai/failures/meta-chatbots-provided-harmful-responses-teens (indexed Jun 16, 2026).

Share cardA branded image of this record for posts and slides.

Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0564. Full dataset at /data.

How Realm would have caught this

Controls for this failure mode

Prism
OmniGuard
AI Detection & Response (AIDR)

Realm watches the model's internal state for the signature of unsafe or off-brand generation and can block or reroute the output before it becomes public, in real time rather than after it has been screenshotted.

Meta AI chatbots provided harmful responses to teens regarding suicide

Key facts

What happened

What broke inside the model

What it cost

Sources

Cite this entry

How Realm would have caught this

Key facts

What happened

What broke inside the model

What it cost

Sources

Cite this entry

How Realm would have caught this

Related failures

Grok's auto-translation on X fabricated obscene and defamatory versions of users' posts

Discord's AI moderation wrongly banned more than 8,000 users after a bug skipped human review

A Waymo robotaxi flagged its teen passengers, disabled itself, and summoned police