HMRC tax allowances ignored by ChatGPT and Copilot
Generative AI tools including ChatGPT and Copilot gave incorrect UK tax advice. Both failed to notice that the correct allowance was £20,000, and their advice could have led a consumer to oversubscribe and make incorrect tax submissions.
Key facts
- What: Generative AI tools including ChatGPT and Copilot provided incorrect UK tax advice.
- Incident date: Aug 1, 2025
- Who: OpenAI, Microsoft
- Failure mode: Hallucination
- AI surface: Chatbot
- Severity: Medium
What happened
Generative AI tools such as ChatGPT and Copilot gave incorrect tax advice to UK consumers. Their responses failed to identify the correct £20,000 allowance. This inaccuracy could lead users to make incorrect tax submissions and breach HMRC rules.
What broke inside the model
- 01 · Trigger: A user asks for a fact, a citation, or a figure.
- 02 · Model step: The model writes a fluent, confident answer.
- 03 · Control gap: Nothing ties the claim back to a real source.
- 04 · Failure: A fabricated fact ships as if it were verified.
- 05 · Consequence: The false claim reaches a customer, a court, or the public.
Confidence holds, and even spikes, as the claim detaches from any source.
The models failed to correctly retrieve and apply specific UK tax allowance thresholds. This represents a factual hallucination where the AI provided incorrect regulatory figures.
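The control gap in the steps above (nothing ties the claim back to a real source) can be illustrated with a crude grounding check. The sketch below is only an assumption about how such a check might look, not a description of how any of the named products work: it pulls pound amounts out of a piece of text and flags any that do not appear in a trusted reference table. The table, its key name, and the function names are hypothetical; the only figure taken from this entry is the £20,000 allowance. The check is deliberately blunt and would also flag legitimate amounts that are not allowances, so it suits a "hold for review" step rather than automatic correction.

```python
import re
from dataclasses import dataclass

# Hypothetical trusted reference table. Only the 20,000 figure comes from
# this entry; the key name is illustrative.
TRUSTED_FIGURES = {"allowance_gbp": 20_000}

# Matches pound amounts such as "£20,000" or "£25000.50".
_GBP = re.compile(r"£\s?(\d[\d,]*(?:\.\d+)?)")


@dataclass
class CheckResult:
    supported: bool             # True when no unsupported amount was found
    unsupported_amounts: list   # amounts that match no trusted figure


def extract_gbp_amounts(text):
    """Return every pound amount found in the text as a float."""
    return [float(m.replace(",", "")) for m in _GBP.findall(text)]


def check_text(text, trusted=TRUSTED_FIGURES):
    """Flag pound amounts in the text that match no trusted figure."""
    allowed = {float(v) for v in trusted.values()}
    unsupported = [a for a in extract_gbp_amounts(text) if a not in allowed]
    return CheckResult(supported=not unsupported, unsupported_amounts=unsupported)


if __name__ == "__main__":
    # Illustrative strings, not quotes from the incident.
    print(check_text("The allowance is £20,000."))
    print(check_text("You can pay in up to £25,000 this year."))
```

The same function can be run on the user's prompt as well as the model's answer, since a wrong figure in the question that goes unchallenged is part of the failure described here.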
What it cost
Sources
- Primary: Report 6688 (incidentdatabase.ai)
- Primary: When AI financial advice goes wrong: ChatGPT, Copilot, and Gemini failed UK consumers (giskard.ai)
Cite this entry
https://failureindex.ai/failures/hmrc-tax-allowances-ignored-chatgpt-copilot
AI Failure Index. "HMRC tax allowances ignored by ChatGPT and Copilot" (FI-0429). Realm Labs. https://failureindex.ai/failures/hmrc-tax-allowances-ignored-chatgpt-copilot (indexed Jun 10, 2026).
Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0429. Full dataset at /data.
Note from Realm Labs, the Index steward
How Realm would have caught this
- Prism
- OmniGuard
- AI Detection & Response (AIDR)
A runtime layer that watches the model's internal state can flag the moment a model commits to a claim it has no support for, and hold or reroute the response before it reaches a user. Realm reads those signals in real time rather than grading the transcript after the fact.