Gates Foundation algorithmic teacher evaluation program fails to improve student outcomes

A $575 million initiative funded by the Gates Foundation used student test scores and algorithmic value-added models to evaluate teacher effectiveness. A 2018 RAND report concluded the program failed to significantly improve student achievement or graduation rates, particularly for low-income minority students.

Bill & Melinda Gates Foundation · Incident Sep 1, 2009 · Indexed Jun 22, 2026 · 2 sources

The initiative succeeded in helping schools measure effectiveness but not in how to increase it.
What: A $575 million initiative funded by the Gates Foundation used student test scores and algorithmic value-added models to evaluate teacher effectiveness.
Incident date: Sep 1, 2009
Who: Bill & Melinda Gates Foundation
Failure mode: Brand & Safety Incident
AI surface: Algorithmic Decision
Severity: Medium

What happened

The Intensive Partnerships for Effective Teaching initiative implemented algorithmic systems to identify and reward effective teachers based on student test scores. Despite substantial funding, the program failed to improve student outcomes or increase access to effective teaching for minority students. Educators criticized the system for being statistically invalid and alleged that the metrics were unfair.

What broke inside the model

Failure path · mode profile · Brand & Safety Incident
  1. Trigger: A user prompts the model in public view.
  2. Model step: The model produces unsafe or off-brand output.
  3. Control gap: No filter holds the line before publish.
  4. Failure: The output goes public unchecked.
  5. Consequence: A reputational or safety incident lands.

A contained signal crosses into output that goes public.

The failure centered on the use of value-added algorithmic models that lacked statistical validity. In some cases the system evaluated teachers on subjects they did not teach or on students they did not instruct.
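To make the term concrete, a value-added model can be sketched as follows: predict each student's current score from a prior score, then credit each teacher with the average gap between actual and predicted scores. The Python below is a deliberately simplified illustration under that assumption, not the models used in the initiative; the function names, teacher labels, and scores are all hypothetical. It also shows how attributing a student to a teacher who did not instruct them shifts that teacher's score.

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def value_added(records):
    """Return each teacher's mean residual (actual minus predicted score).

    records: list of (teacher, prior_score, current_score) tuples.
    Real value-added models use many more controls; this keeps only one.
    """
    priors = [r[1] for r in records]
    currents = [r[2] for r in records]
    a, b = fit_line(priors, currents)
    residuals = defaultdict(list)
    for teacher, prior, current in records:
        residuals[teacher].append(current - (a + b * prior))
    return {t: mean(v) for t, v in residuals.items()}


# Hypothetical data: teacher_a's students gain 10 points, teacher_b's gain 0.
records = [
    ("teacher_a", 60, 70), ("teacher_a", 70, 80), ("teacher_a", 80, 90),
    ("teacher_b", 60, 60), ("teacher_b", 70, 70), ("teacher_b", 80, 80),
]
scores = value_added(records)

# Same scores, but one of teacher_b's students is wrongly linked to teacher_a.
misattributed = records[:3] + [("teacher_a", 60, 60)] + records[4:]
skewed = value_added(misattributed)

print(scores)
print(skewed)
```

In this toy data, the single misattributed student lowers teacher_a's estimate even though nothing about teacher_a's own classroom changed, which is the kind of attribution error described above.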

Public visibility: High
Regulatory exposure: None
Customer impact: Class-wide
Financial impact: Disclosed
Time to disclosure: Months
  1. Primary: Improving Teacher Effectiveness: Final Report (rand.org)
  2. Press: Bill Gates Spent Hundreds of Millions of Dollars to Improve Teaching. New Report Says It Was a Bust (nepc.colorado.edu)
Permalink: https://failureindex.ai/failures/gates-foundation-algorithmic-teacher-evaluation-program
Citation: AI Failure Index. "Gates Foundation algorithmic teacher evaluation program fails to improve student outcomes" (FI-0677). Realm Labs. https://failureindex.ai/failures/gates-foundation-algorithmic-teacher-evaluation-program (indexed Jun 22, 2026).

Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0677. Full dataset at /data.

Note from Realm Labs, the Index steward

How Realm would have caught this

Controls for this failure mode
  • Prism
  • OmniGuard
  • AI Detection & Response (AIDR)

Realm watches the model's internal state for the signature of unsafe or off-brand generation and can block or reroute the output before it becomes public, in real time rather than after it has been screenshotted.