TurboTax's Intuit Assist gave wrong tax advice on over half of test questions, the Post found

Washington Post tech columnist Geoffrey A. Fowler tested TurboTax's Intuit Assist AI chatbot with 16 tax questions and found it gave wrong or irrelevant answers on more than half. Specific failures included recommending incorrect filing statuses and fabricating irrelevant education credit advice when asked about air conditioner tax credits. Even after Intuit updated the software, the chatbot remained unhelpful on a quarter of the questions.

Intuit · Incident Mar 4, 2024 · Indexed Jun 4, 2026 · 3 sources

The AI treated a question about air conditioner tax credits as a prompt to hallucinate education credit advice, showing that confident retrieval without relevance checking is dangerous in tax compliance.
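What: Washington Post tech columnist Geoffrey A. Fowler tested TurboTax's Intuit Assist AI chatbot with 16 tax questions and found it gave wrong or irrelevant answers on more than half.
Incident date: Mar 4, 2024
Who: Intuit
Failure mode: Hallucination
AI surface: Chatbot
Severity: High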

What happened

Washington Post tech columnist Geoffrey A. Fowler tested TurboTax's Intuit Assist AI chatbot with 16 tax questions during the 2024 tax season and found it gave wrong or irrelevant answers on more than half. When asked about tax credits for a new air conditioner, the chatbot responded with irrelevant information about education credits and 1098-T forms instead of the correct residential energy credit. The chatbot also failed to provide correct filing status guidance and pasted irrelevant content from community forums instead of answering specific questions. H&R Block's competing AI Tax Assist also gave unhelpful answers on over 30 percent of the same test questions.

What broke inside the model

Failure path · mode profile · Hallucination
  1. Trigger: A user asks for a fact, a citation, or a figure.
  2. Model step: The model writes a fluent, confident answer.
  3. Control gap: Nothing ties the claim back to a real source.
  4. Failure: A fabricated fact ships as if it were verified.
  5. Consequence: The false claim reaches a customer, a court, or the public.

Confidence holds, and even spikes, as the claim detaches from any source.

The generative AI chatbot's retrieval and generation pipeline failed to match user questions to the correct tax code provisions, instead surfacing tangentially related content from community forums and unrelated tax topics. When asked about air conditioner tax credits, it returned irrelevant education credit information including 1098-T forms, demonstrating a fundamental failure in relevance matching and contextual understanding. The system lacked sufficient guardrails to prevent confident presentation of fabricated or mismatched tax advice.
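The relevance-matching gap described above can be illustrated with a minimal sketch of a gate that sits between retrieval and the answer. It is not Intuit's pipeline: the function names, the stopword list, the 0.5 threshold, and the sample passages are all hypothetical, and plain term overlap stands in for whatever relevance model a production system would use.

```python
import re

# Hypothetical stopword list; a real system would use a proper relevance model.
STOPWORDS = {"a", "an", "the", "for", "of", "to", "is", "are", "can", "i",
             "my", "on", "in", "and", "or", "what", "do", "get", "new"}


def content_terms(text):
    """Lowercase word tokens with stopwords removed."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def relevance(question, passage):
    """Fraction of the question's content terms that also appear in the passage."""
    q_terms = content_terms(question)
    if not q_terms:
        return 0.0
    return len(q_terms & content_terms(passage)) / len(q_terms)


def gate(question, passage, threshold=0.5):
    """Return the passage if it clears the (assumed) threshold, else None to abstain."""
    return passage if relevance(question, passage) >= threshold else None


# Illustrative placeholder strings, not real chatbot output or tax guidance.
question = "Can I get a tax credit for a new air conditioner?"
off_topic = "Education credits are reported using Form 1098-T from your school."
on_topic = "The residential energy credit may apply to a qualifying air conditioner."

print(gate(question, off_topic))  # None: the education passage shares no content terms
print(gate(question, on_topic))   # the on-topic passage is passed through
```

Under these assumptions, the education-credit passage scores zero against the air conditioner question and is withheld rather than presented as an answer. Exact term overlap is deliberately crude (it treats "credit" and "credits" as different words); the point is only that an explicit relevance check with an abstain path is a separate step from retrieval.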

Public visibility: High
Regulatory exposure: Possible
Customer impact: Class-wide
Financial impact: Unknown
Time to disclosure: Days
  1. Press: TurboTax and H&R Block's AI chatbots are giving bad tax advice (washingtonpost.com)
  2. Press: Dangers of AI-Powered Chatbot Tax Advice (bankler.com)
  3. Press: Overreliance on AI for Tax Advice: A Cautionary Perspective (taxexecutive.org)
Permalink: https://failureindex.ai/failures/turbotax-intuit-assist-gave-wrong-tax
Citation: AI Failure Index. "TurboTax's Intuit Assist gave wrong tax advice on over half of test questions, the Post found" (FI-0084). Realm Labs. https://failureindex.ai/failures/turbotax-intuit-assist-gave-wrong-tax (indexed Jun 4, 2026).

Data fields CC-BY 4.0, prose citation permitted. Incident ID FI-0084. Full dataset at /data.

Note from Realm Labs, the Index steward

How Realm would have caught this

Controls for this failure mode
  • Prism
  • OmniGuard
  • AI Detection & Response (AIDR)

A runtime layer that watches the model's internal state can flag the moment a model commits to a claim it has no support for, and hold or reroute the response before it reaches a user. Realm reads those signals in real time rather than grading the transcript after the fact.