UT Austin scrapped its GRADE machine-learning PhD admissions system over entrenched bias

UT Austin's Department of Computer Science used GRADE, a machine-learning system trained on past admissions decisions, to score and organize PhD applications from 2013 through 2019. Critics pointed out that the system reproduced historical inequities: it encoded institutional prestige bias and rewarded linguistic patterns in recommendation letters that disadvantaged underrepresented groups. The university discontinued GRADE in 2020, officially citing maintenance difficulties, though the announcement coincided with public criticism of its fairness.

University of Texas at Austin · Incident Dec 1, 2020 · Indexed Jun 4, 2026 · 3 sources

A machine-learning system trained on past human decisions will faithfully immortalize every bias those decisions contained.
What: UT Austin's Department of Computer Science used GRADE, a machine-learning system trained on past admissions decisions, to score and organize PhD applications from 2013 through 2019.
Incident date: Dec 1, 2020
Who: University of Texas at Austin
Failure mode: Policy Violation
AI surface: Copilot
Severity: Medium

What happened

From the 2013 through 2019 admissions cycles, UT Austin's Department of Computer Science used GRADE to assign numerical scores to PhD applicants based on GPAs, the perceived prestige of their undergraduate institution, and keywords in recommendation letters. Faculty used these scores to prioritize which applications received careful review, meaning lower-scored applicants received less thorough evaluation. In December 2020, public criticism on social media and from researchers studying algorithmic bias prompted media coverage, and the university confirmed it had discontinued the system. UT Austin officially attributed the discontinuation to technical maintenance difficulties rather than the bias concerns raised by critics.

What broke inside the model

Failure path · mode profile · Policy Violation
  1. Trigger: A prompt pushes against a deployment boundary.
  2. Model step: The model produces the disallowed output.
  3. Control gap: No enforcement blocks it at generation time.
  4. Failure: The output crosses the policy line.
  5. Consequence: A limit the business set is breached in public.

The output crosses a policy boundary the deployment had defined.

GRADE was trained solely on historical admissions decisions made before 2013, so it encoded the biases embedded in those past human judgments and perpetuated them for seven consecutive admissions cycles. The system used proxies correlated with race and gender, such as institutional prestige categories that undervalued HBCUs and women's colleges, and word-frequency scoring of recommendation letters that favored agentic language over the communal language more commonly used to describe female applicants. No audit was ever conducted to measure how scores differed across demographic groups.
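The kind of audit described as missing can be sketched in a few lines. What follows is a minimal illustration, not GRADE's code or any procedure UT Austin used: the group labels, scores, review threshold, and function names are all hypothetical. It computes, per group, the share of applicants whose score clears a review threshold, then the ratio of the lowest rate to the highest.

```python
from collections import defaultdict


def selection_rates(records, threshold):
    """Return, per group, the fraction of applicants scoring at or above threshold.

    `records` is an iterable of (group_label, score) pairs.
    """
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, score in records:
        totals[group] += 1
        if score >= threshold:
            selected[group] += 1
    return {group: selected[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates):
    """Return the lowest group selection rate divided by the highest.

    Returns None when no group has any selected applicants.
    """
    highest = max(rates.values())
    if highest == 0:
        return None
    return min(rates.values()) / highest


# Hypothetical (group, score) pairs for illustration only; not real GRADE data.
records = [
    ("A", 0.9), ("A", 0.8), ("A", 0.7), ("A", 0.4),
    ("B", 0.8), ("B", 0.5), ("B", 0.4), ("B", 0.3),
]

# Assumed cutoff above which an application gets careful review.
rates = selection_rates(records, threshold=0.6)
ratio = disparate_impact_ratio(rates)

print(rates)
print(ratio)
```

A ratio well below 1 means one group clears the threshold far less often than another. A common heuristic borrowed from US employment-selection guidance treats a ratio under 0.8 as a flag for closer review; whether that cutoff suits graduate admissions is a judgment call, and a low ratio is a prompt to investigate rather than proof of bias.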

Public visibility: Medium
Regulatory exposure: None
Customer impact: Class-wide
Financial impact: Unknown
Time to disclosure: Days

  1. Press: The Death and Life of an Admissions Algorithm (insidehighered.com)
  2. Press: Uni revealed it killed off its PhD-applicant screening AI (theregister.com)
  3. Primary: Incident 135: UT Austin's GRADE Algorithm Reportedly Reduced Review of Lower-Scored PhD Applicants Amid Bias Concerns (incidentdatabase.ai)

Permalink: https://failureindex.ai/failures/ut-austin-scrapped-grade-machine-learning
Citation: AI Failure Index. "UT Austin scrapped its GRADE machine-learning PhD admissions system over entrenched bias" (FI-0153). Realm Labs. https://failureindex.ai/failures/ut-austin-scrapped-grade-machine-learning (indexed Jun 4, 2026).

Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0153. Full dataset at /data.

Note from Realm Labs, the Index steward

How Realm would have caught this

Controls for this failure mode
  • Prism
  • OmniGuard

Realm compares what the model is about to output or do against the policy that governs the deployment, in real time, and can deny or redact the action before it takes effect, which is the gap an after-the-fact review never closes in time.