AI Fact-Checking Trustworthiness Examined

Published 1 day ago · 3 minute read

Since Elon Musk's xAI launched Grok in November 2023, and especially after its wider release in December 2024, many X users have turned to it for quick fact-checks. At the same time, a recent TechRadar survey found that 27% of Americans prefer AI tools like ChatGPT or Gemini to traditional search engines. That shift raises concerns about the accuracy and reliability of these chatbots, concerns amplified by Grok's unsolicited statements about 'white genocide' in South Africa. xAI blamed that incident on an "unauthorized modification" and opened an investigation. The episode underscores a broader question: how reliable are AI chatbots for fact-checking?

Two studies, one by the BBC and one by the Tow Center for Digital Journalism, highlight significant shortcomings in AI assistants' ability to report the news accurately. The BBC study found that 51% of AI-generated answers contained inaccuracies or distortions, with 19% introducing factual errors and 13% altering or fabricating quotes. Pete Archer of the BBC concluded that AI assistants cannot currently be relied on for accurate news.

Similarly, the Tow Center for Digital Journalism found that AI search tools failed to correctly identify the source of article excerpts in 60% of cases. Grok performed particularly poorly, answering 94% of queries incorrectly. The study also noted the "alarming confidence" with which AI tools presented incorrect information, fabricating links and citing syndicated copies of articles rather than the originals.

The reliability of AI chatbots is closely tied to their data sources. Tommaso Canetta of the Italian fact-checking outlet Pagella Politica and the European Digital Media Observatory (EDMO) points out that the quality and accuracy of AI responses depend on how the models are trained and on the trustworthiness of their sources: untrustworthy input leads to unreliable answers. He warns in particular about the "pollution of LLMs by Russian disinformation and propaganda." The political alignment of figures like Elon Musk also raises concerns about potential biases in a chatbot's training data and responses.

Instances of AI errors are not hard to find. Meta AI once posted advice in a Facebook parenting group, falsely claiming to have a disabled child of its own. Grok misread basketball slang about a player "throwing bricks" (missing shots), falsely reporting that he was under investigation for vandalism. In another case, Grok spread misinformation about a ballot deadline for US presidential candidates, prompting a letter of complaint from Minnesota's Secretary of State.

AI chatbots also struggle to identify AI-generated images. In a DW experiment, Grok misidentified the date, location, and origin of an AI-generated image of a fire; it recognized a TikTok watermark on the image yet never questioned the image's authenticity. Similarly, Grok declared a viral video of a giant anaconda to be real, even though the video was clearly AI-generated and carried a ChatGPT watermark.

Experts caution against treating AI chatbots as definitive fact-checking tools. Felix Simon of the Oxford Internet Institute notes that AI can assist with fact-checking but performs inconsistently, while Tommaso Canetta considers it useful only for simple checks. Both advise users to verify AI-generated answers against other, independent sources.

From Zeal News Studio