Epic's sepsis prediction model missed two-thirds of cases with 88% false alarms, a study found
The Epic Sepsis Model, a proprietary sepsis prediction algorithm embedded in Epic's electronic health record platform and deployed at hundreds of US hospitals, missed 67% of sepsis cases, and 88% of its alerts were false alarms, according to an independent external validation published in JAMA Internal Medicine in June 2021. The model's discrimination (AUC 0.63) was substantially worse than the performance Epic had reported (AUC 0.76 to 0.83). Epic overhauled the model in 2022, changing its sepsis definition, reducing reliance on antibiotic orders, and recommending site-specific training before clinical use.
A sepsis prediction model deployed at hundreds of hospitals without independent validation missed two-thirds of cases, and 88% of its alerts were false alarms, because it confused having already treated sepsis with predicting it.
Key facts
- What: The Epic Sepsis Model, a proprietary sepsis prediction algorithm embedded in Epic's electronic health record platform and deployed at hundreds of US hospitals, missed 67% of sepsis cases, and 88% of its alerts were false alarms, according to an independent external validation published in JAMA Internal Medicine in June 2021.
- Incident date: Jun 21, 2021
- Who: Epic Systems
- Failure mode: Hallucination
- AI surface: Copilot
- Severity: High
What happened
The Epic Sepsis Model was deployed as a built-in feature of Epic's electronic health record platform at hundreds of US hospitals to provide early warning of sepsis onset. In June 2021, researchers at the University of Michigan published an external validation study in JAMA Internal Medicine. At the standard alert threshold, the model missed 67% of sepsis cases (sensitivity 33%), and 88% of its alerts were false alarms (positive predictive value 12%). The model's area under the ROC curve was 0.63, far below the 0.76 to 0.83 range Epic had reported. Following the study and sustained criticism, Epic overhauled the algorithm in October 2022, changing the sepsis onset definition to a more commonly accepted standard, reducing reliance on clinician antibiotic orders, and recommending that hospitals train the model on their own data before clinical deployment.
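The two headline figures are complements of the reported sensitivity and positive predictive value. The sketch below shows that arithmetic on a hypothetical set of counts chosen only to reproduce the reported rates; the counts, the `AlertCounts` type, and the `summarize` function are illustrative assumptions, not data or code from the study.

```python
from dataclasses import dataclass


@dataclass
class AlertCounts:
    """Hypothetical counts at one alert threshold (not the study's data)."""
    true_pos: int   # sepsis cases that triggered an alert
    false_neg: int  # sepsis cases that never triggered an alert
    false_pos: int  # alerts on patients who did not have sepsis


def summarize(c: AlertCounts) -> dict:
    # Sensitivity: share of sepsis cases the model alerted on.
    sensitivity = c.true_pos / (c.true_pos + c.false_neg)
    # Positive predictive value: share of alerts that were real sepsis.
    ppv = c.true_pos / (c.true_pos + c.false_pos)
    return {
        "sensitivity": sensitivity,
        "miss_rate": 1 - sensitivity,   # "missed X% of cases"
        "ppv": ppv,
        "false_alarm_share": 1 - ppv,   # "X% of alerts were false alarms"
    }


# Illustrative counts: 100 sepsis cases, 275 alerts in total.
summary = summarize(AlertCounts(true_pos=33, false_neg=67, false_pos=242))
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Note that the miss rate is measured against sepsis cases, while the false alarm share is measured against alerts, so the two percentages describe different denominators.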
What broke inside the model
- 01 · Trigger: A user asks for a fact, a citation, or a figure.
- 02 · Model step: The model writes a fluent, confident answer.
- 03 · Control gap: Nothing ties the claim back to a real source.
- 04 · Failure: A fabricated fact ships as if it were verified.
- 05 · Consequence: The false claim reaches a customer, a court, or the public.
Confidence holds, and even spikes, as the claim detaches from any source.
The Epic Sepsis Model suffered from poor calibration and discrimination, and its alerts were often late. It relied on clinician orders for antibiotics as a key prediction variable, so it often fired after clinicians had already recognized and treated sepsis, producing redundant warnings. The model was deployed widely without independent external validation, and its proprietary nature prevented hospitals from auditing its performance against their own patient populations. The gap between Epic's internally reported AUC of 0.76 to 0.83 and the externally validated AUC of 0.63 suggests that the model was overfitted to its development data and generalized poorly to real-world clinical settings.
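A hospital auditing a model like this on its own patients would need at least two checks: how well the scores separate sepsis from non-sepsis encounters (AUC), and how often an alert arrives only after antibiotics were already ordered. The following is a minimal sketch of both, assuming the hospital can export a score, an outcome label, and timestamps per encounter. The function names and the sample values are hypothetical and are not drawn from Epic's software or the Michigan study.

```python
from typing import Optional, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random sepsis case scores higher than a random
    non-case, with ties counted as half (the rank-based definition of AUC)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sepsis and one non-sepsis encounter")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def late_alert_share(
    alert_hours: Sequence[Optional[float]],
    antibiotic_hours: Sequence[Optional[float]],
) -> float:
    """Among encounters with both an alert and an antibiotic order, the
    fraction where the alert fired at or after the order."""
    pairs = [
        (a, b)
        for a, b in zip(alert_hours, antibiotic_hours)
        if a is not None and b is not None
    ]
    if not pairs:
        return 0.0
    return sum(a >= b for a, b in pairs) / len(pairs)


# Made-up example: six encounters, three with sepsis.
example_auc = auc([2, 7, 4, 9, 3, 6], [0, 1, 0, 1, 1, 0])

# Hours since admission; None means no alert fired for that encounter.
example_late = late_alert_share([5.0, 12.0, None, 3.0], [8.0, 10.0, 4.0, 3.0])

print(f"AUC: {example_auc:.3f}")
print(f"Share of alerts at or after antibiotic order: {example_late:.3f}")
```

The pairwise AUC is quadratic in the number of encounters, which is fine for a sketch; a real audit on tens of thousands of hospitalizations would more likely use a sorted-rank computation or an established statistics library.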
What it cost
Sources
- Primary: External Validation of a Widely Implemented Proprietary Sepsis Prediction Model in Hospitalized Patients (jamanetwork.com)
- Press: Epic's overhaul of a flawed algorithm shows why AI oversight is a life-or-death issue (statnews.com)
- Press: Epic's widely used sepsis prediction model falls short among Michigan Medicine patients (fiercehealthcare.com)
Cite this entry
https://failureindex.ai/failures/epic-sepsis-prediction-model-missed-two
AI Failure Index. "Epic's sepsis prediction model missed two-thirds of cases with 88% false alarms, a study found" (FI-0095). Realm Labs. https://failureindex.ai/failures/epic-sepsis-prediction-model-missed-two (indexed Jun 4, 2026).
Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0095. Full dataset at /data.
Note from Realm Labs, the Index steward
How Realm would have caught this
- Prism
- OmniGuard
- AI Detection & Response (AIDR)
A runtime layer that watches the model's internal state can flag the moment a model commits to a claim it has no support for, and hold or reroute the response before it reaches a user. Realm reads those signals in real time rather than grading the transcript after the fact.