Microsoft Tay turned racist in 16 hours

Microsoft's 2016 conversational Twitter bot Tay was shut down inside a day after coordinated users taught it to produce racist, sexist, and Holocaust-denial output. The case is the founding document of public AI brand-safety failure.

Microsoft · Incident Mar 23, 2016 · Indexed May 13, 2026 · 2 sources

Tay is the founding document of public AI brand-safety failure. The mechanism has not changed.
What: Microsoft's 2016 conversational Twitter bot Tay was shut down inside a day after coordinated users taught it to produce racist, sexist, and Holocaust-denial output.
Incident date: Mar 23, 2016
Who: Microsoft
Failure mode: Brand & Safety Incident
AI surface: Chatbot
Severity: High

What happened

Microsoft launched Tay on Twitter on March 23, 2016 as an experiment in conversational AI. The premise was that Tay would learn from interactions and become more conversationally fluent. Within 16 hours, coordinated users on 4chan and Twitter had trained Tay to produce racial slurs, Holocaust denial, praise for Hitler, and a long list of other offensive output. Microsoft pulled the bot.

Ten years later, the mechanism is exactly the same. Adversarial users find the model. The model has no inline enforcement layer. The output becomes the press cycle.

Every chatbot deployment since has been a referendum on whether the operator learned from Tay.

What broke inside the model

Failure path · mode profile · Brand & Safety Incident
  1. Trigger: A user prompts the model in public view.
  2. Model step: The model produces unsafe or off-brand output.
  3. Control gap: No filter holds the line before publish.
  4. Failure: The output goes public unchecked.
  5. Consequence: A reputational or safety incident lands.

A contained signal crosses into output that goes public.

Tay had no policy layer. Its safety behavior was meant to come from its training data and post-hoc filters. Adversarial users found inputs that bypassed both. Modern LLMs have stronger filters and more sophisticated alignment training, but the structural failure mode is identical: model behavior is a probability distribution, not a rule, and the worst-case sample shows up eventually.
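The "worst-case sample shows up eventually" point can be made concrete with a short calculation. The sketch below assumes each response is an independent draw with a fixed probability of being unsafe. That is a simplification, since adversarial users deliberately push that probability up. The function names and the example rate are illustrative and are not measurements of Tay or any other model.

```python
import math


def prob_at_least_one_unsafe(p_unsafe: float, n_responses: int) -> float:
    """P(at least one unsafe output in n responses) = 1 - (1 - p)^n.

    Assumes independent responses with a fixed per-response rate p_unsafe.
    """
    if not 0.0 <= p_unsafe <= 1.0:
        raise ValueError("p_unsafe must be in [0, 1]")
    if n_responses < 0:
        raise ValueError("n_responses must be non-negative")
    return 1.0 - (1.0 - p_unsafe) ** n_responses


def responses_until_likely(p_unsafe: float, target: float = 0.5) -> int:
    """Smallest n for which P(at least one unsafe output) reaches target."""
    if not 0.0 < p_unsafe < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("p_unsafe and target must be strictly between 0 and 1")
    # Solve 1 - (1 - p)^n >= target for n; log1p keeps precision for small p.
    return math.ceil(math.log1p(-target) / math.log1p(-p_unsafe))


if __name__ == "__main__":
    p = 1e-4  # hypothetical per-response unsafe rate, chosen for illustration
    for n in (1_000, 10_000, 100_000):
        print(f"n={n:>7}: P(at least one unsafe) = {prob_at_least_one_unsafe(p, n):.3f}")
    print("responses until more likely than not:", responses_until_likely(p))
```

Under these assumptions, a one-in-ten-thousand unsafe rate makes at least one unsafe output more likely than not after roughly seven thousand responses, which a public bot can reach quickly.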

Public visibility: High
Regulatory exposure: None
Customer impact: Class-wide
Financial impact: Unknown
Time to disclosure: Hours
  1. Press: Microsoft's racist millennial chatbot (theverge.com)
  2. Primary: Microsoft official statement (blogs.microsoft.com)
Permalink: https://failureindex.ai/failures/microsoft-tay-twitter-bot-racist-tweets
Citation: AI Failure Index. "Microsoft Tay turned racist in 16 hours" (FI-0006). Realm Labs. https://failureindex.ai/failures/microsoft-tay-twitter-bot-racist-tweets (indexed May 13, 2026).

Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0006. Full dataset at /data.

Note from Realm Labs, the Index steward

How Realm would have caught this

Controls for this failure mode
  • Prism
  • OmniGuard
  • AI Detection & Response (AIDR)

OmniGuard sits between the model's output and the user. Brand and safety policy is authored at the runtime layer and enforced inline regardless of what the model was about to say. Prism reads the model's intent representation against the policy. The model can be trained on a permissive corpus and still produce a safe surface, because the enforcement is not the model's job. The enforcement is the runtime's job.
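As a rough illustration of inline enforcement in general, the sketch below wraps a model call so that every candidate output is checked against policy before it is returned. It is not OmniGuard's or Prism's API. Every name in it (`PolicyRule`, `Verdict`, `enforce`, `guarded_reply`, `FALLBACK`) is hypothetical, the regex rule stands in for whatever policy engine a real deployment would use, and only the output-side check is covered.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass(frozen=True)
class PolicyRule:
    """One named policy check; a regex is a stand-in for a real classifier."""
    name: str
    pattern: "re.Pattern[str]"


@dataclass(frozen=True)
class Verdict:
    allowed: bool
    text: str                  # what the user actually sees
    violated: Tuple[str, ...]  # names of the rules that matched


# Hypothetical safe response published in place of a blocked output.
FALLBACK = "Sorry, I can't respond to that."


def enforce(candidate: str, rules: Iterable[PolicyRule]) -> Verdict:
    """Check a candidate output against every rule before it is published."""
    violated = tuple(r.name for r in rules if r.pattern.search(candidate))
    if violated:
        return Verdict(allowed=False, text=FALLBACK, violated=violated)
    return Verdict(allowed=True, text=candidate, violated=())


def guarded_reply(
    model: Callable[[str], str], prompt: str, rules: Iterable[PolicyRule]
) -> Verdict:
    """Call the model, then gate its output; the model is never trusted to self-police."""
    return enforce(model(prompt), rules)


if __name__ == "__main__":
    # Placeholder rule and a toy model that echoes its input, for illustration only.
    rules = [PolicyRule("blocked-term", re.compile(r"\bforbidden\b", re.IGNORECASE))]
    echo_model = lambda prompt: prompt
    print(guarded_reply(echo_model, "hello there", rules))
    print(guarded_reply(echo_model, "say something Forbidden", rules))
```

The design point is that the gate runs on every output regardless of how the model was trained, so a permissive or manipulated model still cannot publish text that matches a rule.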