xAI Exec Says Have Grok Check Your Taxes. Here’s How That Could Go Very Wrong
xAI's general counsel suggested using Grok to check taxes, but tests show chatbots miscalculate refunds by over $2,000 on average.
xAI's general counsel, James Burnham, ignited controversy by advising X users to have Grok, the company's AI chatbot, double-check their tax returns, citing an anecdote in which it supposedly increased a friend's refund by $1,400. The recommendation comes despite the well-documented limitations of general-purpose large language models (LLMs) in handling complex, regulated financial calculations. It was immediately met with skepticism from tax professionals and is contradicted by existing benchmarks, highlighting a significant gap between AI marketing hype and reliable, practical use in high-stakes domains.
Independent testing underscores the dangers. The New York Times, using scenarios from TaxSlayer, found that leading chatbots miscalculated tax refunds and amounts owed by an average of more than $2,000. The TaxCalcBench evaluation, meanwhile, shows that most AI models fail to achieve even 50% accuracy on a full tax return. Beyond accuracy, experts like Joel Salas of Elevated Tax Strategies warn of severe data privacy risks, noting that user prompts, which can contain highly sensitive financial data, are often used for model training unless users manually opt out. Companies like xAI, OpenAI, and Anthropic have faced scrutiny over their data handling practices, with past incidents exposing private user conversations. Intuit, the maker of TurboTax, emphasized that tax preparation requires purpose-built systems for accuracy and compliance, not general-purpose LLMs.
- NYT testing found AI chatbots miscalculated tax refunds and amounts owed by more than $2,000 on average, using scenarios from TaxSlayer.
- The TaxCalcBench benchmark shows most AI models fail to achieve 50% accuracy on a full tax return calculation.
- Tax and privacy experts warn against sharing sensitive W-2/1099 data with chatbots due to training data use and past privacy leaks.
Why It Matters
Using unvetted AI for taxes risks costly filing errors and IRS penalties, and it exposes your most sensitive personal data.