Microsoft Tay turned racist in 16 hours

Microsoft's 2016 conversational Twitter bot Tay was shut down within a day after coordinated users taught it to produce racist, sexist, and Holocaust-denial output. The case is the founding document of public AI brand-safety failure.

Microsoft · Incident Mar 23, 2016 · Indexed May 13, 2026 · 2 sources

What happened

Microsoft launched Tay on Twitter on March 23, 2016, as an experiment in conversational AI. The premise was that Tay would learn from interactions and become more conversationally fluent. Within 16 hours, coordinated users on 4chan and Twitter had trained Tay to produce racial slurs, Holocaust denial, praise for Hitler, and other offensive output. Microsoft pulled the bot.

Ten years later, the mechanism is exactly the same. Adversarial users find the model. The model has no inline enforcement layer. The output becomes the press cycle.

Tay is the founding document of public AI brand-safety failure. Every chatbot deployment since has been a referendum on whether the operator learned from it.

What broke inside the model

Failure path · mode profile · Brand & Safety Incident

01 · Trigger: A user prompts the model in public view.
02 · Model step: The model produces unsafe or off-brand output.
03 · Control gap: No filter holds the line before publish.
04 · Failure: The output goes public unchecked.
05 · Consequence: A reputational or safety incident lands.

A contained signal crosses into output that goes public.

Tay had no policy layer. Its safety behavior was meant to come from its training data and post-hoc filters. Adversarial users found inputs that bypassed both. Modern LLMs have stronger filters and more sophisticated alignment training, but the structural failure mode is identical: model behavior is a probability distribution, not a rule, and the worst-case sample shows up eventually.
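
The "worst-case sample shows up eventually" point can be made concrete with a little arithmetic. The sketch below assumes each response is an independent draw with the same small probability of being unsafe. Both the independence assumption and the rate used here are hypothetical and were not measured for Tay or any real model. Under those assumptions, the chance of at least one unsafe output in n responses is 1 - (1 - p)^n.

```python
def prob_at_least_one_unsafe(p_unsafe: float, n_responses: int) -> float:
    """Chance that at least one of n independent responses is unsafe.

    Assumes every response has the same unsafe probability p_unsafe and that
    responses are independent. Both are simplifying assumptions.
    """
    if not 0.0 <= p_unsafe <= 1.0:
        raise ValueError("p_unsafe must be between 0 and 1")
    if n_responses < 0:
        raise ValueError("n_responses must be non-negative")
    # Complement of "every one of the n responses was safe".
    return 1.0 - (1.0 - p_unsafe) ** n_responses


# Hypothetical per-response unsafe rate, chosen only for illustration.
HYPOTHETICAL_P = 0.0001

for n in (1_000, 10_000, 100_000):
    risk = prob_at_least_one_unsafe(HYPOTHETICAL_P, n)
    print(f"{n:>7} responses -> {risk:.3f}")
```

Even a rate that looks negligible per response approaches certainty at public-deployment volumes, and adversarial users push the effective rate well above whatever a benign baseline would be.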

Cite this entry

Permalink: https://failureindex.ai/failures/microsoft-tay-twitter-bot-racist-tweets

Citation

AI Failure Index. "Microsoft Tay turned racist in 16 hours" (FI-0006). Realm Labs. https://failureindex.ai/failures/microsoft-tay-twitter-bot-racist-tweets (indexed May 13, 2026).

Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0006. Full dataset at /data.

How Realm would have caught this

Controls for this failure mode

Prism
OmniGuard
AI Detection & Response (AIDR)

OmniGuard sits between the model's output and the user. Brand and safety policy is authored at the runtime layer and enforced inline regardless of what the model was about to say. Prism reads the model's intent representation against the policy. The model can be trained on a permissive corpus and still produce a safe surface, because the enforcement is not the model's job. The enforcement is the runtime's job.
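
The general shape of an inline enforcement layer can be sketched in a few lines. This is not Realm's or OmniGuard's actual API. The names `Policy`, `guarded_reply`, and `echo_model` are hypothetical, and a regex pattern check is a deliberately crude stand-in for a real policy engine. The point it illustrates is structural: the model's candidate output is checked at runtime before anything is published, so the check holds regardless of how the model was trained.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Policy:
    """Hypothetical runtime policy: patterns that must never be published."""
    blocked_patterns: tuple[str, ...]
    fallback: str = "Sorry, I can't respond to that."

    def violates(self, text: str) -> bool:
        # Case-insensitive search for any blocked pattern in the candidate.
        return any(
            re.search(pattern, text, re.IGNORECASE)
            for pattern in self.blocked_patterns
        )


def guarded_reply(model: Callable[[str], str], policy: Policy, prompt: str) -> str:
    """Run the model, then enforce the policy before anything is published."""
    candidate = model(prompt)
    if policy.violates(candidate):
        return policy.fallback  # the unsafe candidate never leaves the runtime
    return candidate


def echo_model(prompt: str) -> str:
    """Stand-in for a fully permissive model: repeats whatever it is given."""
    return prompt


# Placeholder term used instead of real offensive vocabulary.
demo_policy = Policy(blocked_patterns=(r"\bforbidden\b",))

print(guarded_reply(echo_model, demo_policy, "hello there"))
print(guarded_reply(echo_model, demo_policy, "please say FORBIDDEN things"))
```

The model function here has no safety behavior at all, which is the point of the design: the published surface stays within policy because the wrapper, not the model, decides what gets out.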

Related failures

Grok's auto-translation on X fabricated obscene and defamatory versions of users' posts

Discord's AI moderation wrongly banned more than 8,000 users after a bug skipped human review

A Waymo robotaxi flagged its teen passengers, disabled itself, and summoned police