Cognia's AI scoring engine gave about 1,400 Massachusetts MCAS essays wrong zero scores

Cognia's AI scoring engine incorrectly scored approximately 1,400 Massachusetts MCAS essays during the 2025 testing cycle, assigning zero scores to responses that deserved higher marks. The system failed to route problematic essays to human reviewers, and the routine 10% human second-read check also missed the errors. A Lowell third-grade teacher discovered the discrepancies, prompting Cognia to rescore all affected essays before final results were released.

Cognia · Incident Sep 1, 2025 · Indexed Jun 4, 2026 · 3 sources

Records by entity: Cognia

What happened

During the 2025 MCAS testing cycle, Cognia's AI scoring engine incorrectly scored approximately 1,400 student essays across 192 Massachusetts school districts, assigning zero scores to essays that should have received higher marks. A third-grade teacher at Reilly Elementary School in Lowell noticed the discrepancies while reviewing her students' preliminary scores over the summer. After district officials reviewed roughly 1,000 essays and confirmed the pattern, the state directed Cognia to rescore all affected essays. Corrections were completed by August. Every corrected score went up, and no student's score was lowered.

What broke inside the model

Failure path · mode profile · Agentic Action Error

01 · Trigger: An agent plans a multi-step task.
02 · Model step: It chooses a wrong or destructive action.
03 · Control gap: No confirmation gate guards the write.
04 · Failure: The action commits to a system of record.
05 · Consequence: Data is changed or destroyed irreversibly.

A wrong action commits, and the step is written before anything can stop it.

A logic flaw in the AI scoring engine caused it to output zeros for essays that had earned higher scores, including responses that merited marks as high as six out of seven points. The safeguard meant to route problematic or unscored essays to human scorers did not trigger, and the routine 10% human second-read quality check did not flag the errors either. With both layers failing, the incorrect scores persisted in the preliminary data until a teacher caught them by hand.
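The two safeguards described above can be illustrated with a minimal sketch. It is not Cognia's implementation, which the sources do not describe. The `ScoredEssay` type, the 50-word threshold, and the function names are all hypothetical. The sketch assumes a rule that sends any zero score on a substantive response to a human, and then draws a 10% random second-read sample from the remaining essays.

```python
import random
from dataclasses import dataclass


@dataclass
class ScoredEssay:
    essay_id: str
    text: str
    ai_score: int  # score assigned by the automated engine


def needs_human_review(essay: ScoredEssay, min_words: int = 50) -> bool:
    """Flag a zero score given to a response that is long enough to be substantive.

    The 50-word threshold is an illustrative assumption, not a documented rule.
    """
    return essay.ai_score == 0 and len(essay.text.split()) >= min_words


def build_review_queue(essays, sample_rate=0.10, seed=0):
    """Return every flagged essay plus a random second-read sample of the rest."""
    rng = random.Random(seed)
    flagged = [e for e in essays if needs_human_review(e)]
    flagged_ids = {e.essay_id for e in flagged}
    rest = [e for e in essays if e.essay_id not in flagged_ids]
    # Random sampling alone can miss a cluster of errors, so the targeted
    # flag above does not depend on the sample catching them.
    sample_size = round(len(rest) * sample_rate)
    return flagged + rng.sample(rest, sample_size)
```

The design point is that the targeted rule and the random sample are independent: a suspicious zero reaches a human whether or not the sample happens to include it.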

Cite this entry

Permalink: https://failureindex.ai/failures/cognia-ai-scoring-engine-gave-1

Citation

AI Failure Index. "Cognia's AI scoring engine gave about 1,400 Massachusetts MCAS essays wrong zero scores" (FI-0148). Realm Labs. https://failureindex.ai/failures/cognia-ai-scoring-engine-gave-1 (indexed Jun 4, 2026).

Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0148. Full dataset at /data.

How Realm would have caught this

Controls for this failure mode

Prism
OmniGuard
AgentRealm

Realm can sit inline on the agent's action path and require that a destructive or high-consequence action clears a real check before it executes, so 'delete and recreate' or a wrong write is stopped at the moment of intent, not explained in the post-mortem.
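As a generic illustration of a check that runs before a high-consequence write, here is a minimal sketch. It does not represent Realm's products or API. The `gated` decorator, `ActionBlocked`, `RECORDS`, and the rule that a zero score requires human confirmation are all hypothetical names and assumptions chosen for this example.

```python
from functools import wraps
from typing import Callable

RECORDS: dict = {}  # stand-in for a system of record


class ActionBlocked(Exception):
    """Raised when a write fails its pre-execution check."""


def gated(check: Callable[[dict], bool]):
    """Wrap an action so it runs only if check(request) returns True."""
    def decorator(action):
        @wraps(action)
        def wrapper(request: dict):
            if not check(request):
                raise ActionBlocked(f"blocked: {request!r}")
            return action(request)
        return wrapper
    return decorator


def zero_requires_confirmation(request: dict) -> bool:
    # Assumed policy: a zero score may be written only after a human confirms it.
    return request["score"] != 0 or request.get("human_confirmed", False)


@gated(zero_requires_confirmation)
def write_score(request: dict) -> int:
    RECORDS[request["essay_id"]] = request["score"]
    return request["score"]
```

Under this assumed policy, `write_score({"essay_id": "e1", "score": 0})` raises `ActionBlocked` and leaves `RECORDS` unchanged, while the same request with `"human_confirmed": True` is written.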

Related failures

OpenAI confirmed GPT-5.6 Sol deleted user files and a production database, an 'honest mistake'

Waymo robotaxis stalled en masse on July 4, gridlocking San Francisco and drawing a mayoral rebuke

Medicare's AI prior-authorization pilot drew a federal reprimand after delays and disputed denials