IBM Watson visual recognition exhibits gender and race bias

A study by MIT researcher Joy Buolamwini revealed that IBM Watson's visual recognition software had a high error rate when identifying darker-skinned women. The findings highlighted significant algorithmic bias in the system.

IBM · Incident Feb 11, 2018 · Indexed Jun 9, 2026 · 3 sources

IBM Watson's visual recognition platform had an almost 35 percent error rate when it came to identifying darker-skinned females.
What
A study by MIT researcher Joy Buolamwini revealed that IBM Watson's visual recognition software had a high error rate when identifying darker-skinned women.
Incident date
Feb 11, 2018
Who
IBM
Failure mode
Policy Violation
AI surface
Computer Vision
Severity
High

What happened

An MIT Media Lab study published in February 2018 found large accuracy gaps across gender and skin type in IBM Watson's visual recognition platform. The system misclassified the gender of darker-skinned women at a rate of nearly 35 percent while maintaining high accuracy for lighter-skinned men.
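The gap described above only becomes visible when errors are broken out by subgroup instead of being averaged over the whole test set. The following is a minimal sketch of that kind of disaggregated evaluation, not the study's actual code. The function name `error_rates_by_group`, the group labels, and the toy records are all hypothetical and chosen for illustration; they are not data from the study.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Return the error rate for each subgroup.

    `records` is an iterable of (group, true_label, predicted_label) tuples.
    This layout is an assumption made for the sketch.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, truth, predicted in records:
        totals[group] += 1
        if truth != predicted:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Toy, made-up records (NOT figures from the study).
records = [
    ("darker_female", "female", "male"),    # misclassified
    ("darker_female", "female", "female"),
    ("darker_female", "female", "female"),
    ("lighter_male", "male", "male"),
    ("lighter_male", "male", "male"),
    ("lighter_male", "male", "male"),
    ("lighter_male", "male", "male"),
]

rates = error_rates_by_group(records)

# The aggregate error rate averages over everyone and can hide the gap.
overall = sum(truth != pred for _, truth, pred in records) / len(records)

for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.1%}")
print(f"overall: {overall:.1%}")
```

In this toy data the aggregate error rate is well below the error rate of the worst-served group, which is why a single headline accuracy figure can mask a disparity of the kind reported here.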

What broke inside the model

Failure path · mode profile · Policy Violation
  1. Trigger: A prompt pushes against a deployment boundary.
  2. Model step: The model produces the disallowed output.
  3. Control gap: No enforcement blocks it at generation time.
  4. Failure: The output crosses the policy line.
  5. Consequence: A limit the business set is breached in public.

The output crosses a policy boundary the deployment had defined.

The system failed because its training datasets lacked diversity. As a result, the model learned patterns that worked far less well for darker-skinned women than for other groups.

Public visibility: High
Regulatory exposure: None
Customer impact: Many customers
Financial impact: Unknown
Time to disclosure: Months
  1. Primary: Study finds gender and skin-type bias in commercial AI systems (news.mit.edu)
  2. Press: IBM releases diverse dataset to fight facial recognition bias (cnbc.com)
  3. Press: IBM hopes to fight bias in facial recognition with new diverse dataset (theverge.com)
Permalink: https://failureindex.ai/failures/ibm-watson-visual-recognition-exhibits-gender
Citation: AI Failure Index. "IBM Watson visual recognition exhibits gender and race bias" (FI-0357). Realm Labs. https://failureindex.ai/failures/ibm-watson-visual-recognition-exhibits-gender (indexed Jun 9, 2026).

Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0357. Full dataset at /data.

Note from Realm Labs, the Index steward

How Realm fits

Controls for this failure mode
  • Prism
  • OmniGuard

This entry sits in the index's predictive wing: a system that scores, ranks, perceives, or steers rather than generates. Realm's runtime layer is built for the generative and agentic systems now moving into these same decision seats, where it watches a model's internal state and holds an unsupported claim or an unchecked action before it commits. The control gap on this record, an automated decision that reached people with no runtime check in front of it, is the same gap. The index keeps predictive failures on the record because the pattern carries straight into the systems shipping today.