Mirror Mirror, on the wall, who hallucinates the most of all?: Anthropic's CEO claims humans hallucinate more than AI, boasting the new model's factual reliability. - The Economic Times

Published 1 day ago · 3 minute read
The Feed
CEO Dario Amodei, speaking at VivaTech 2025 in Paris and at Anthropic's inaugural "Code with Claude" developer day, claimed that AI can now outperform human beings in factual accuracy in structured scenarios. At both of this month's major tech events, he asserted that modern AI models, including the newly released Claude 4 series, may hallucinate at a lower rate than most humans when answering factual, structured questions.

In the context of AI, hallucination refers to instances where AI tools such as ChatGPT, Gemini, Copilot, or Claude misinterpret commands, data, or context. The misinterpretation creates gaps in knowledge, which the tool then fills with assumptions that aren't always factual, or even real. Simply put, it is the generation of fabricated information.

However, Amodei suggests that recent advancements have turned the situation the other way around, though mostly under conditions that can be deemed "controlled."

During his VivaTech keynote, Amodei cited Anthropic's internal testing, in which Claude 3.5's factual accuracy was measured on structured factual quizzes against human participants. The results showed a notable shift in reliability on factual precision, at least in straightforward question-answer tasks.

He reportedly reiterated this stance at the developer-focused "Code with Claude" event, where the Claude Opus 4 and Claude Sonnet 4 models were unveiled, arguing that factual accuracy in AI models depends heavily on prompt design, context, and domain-specific application, particularly in high-stakes environments like legal filings or healthcare. He stressed this while acknowledging the recent legal dispute involving Claude's confabulations.

The CEO also readily admits that hallucinations have not been completely eradicated and that the model remains vulnerable to error, but says it can achieve optimum accuracy when fed the right information.

While modern AI models like the new Claude 4 series are steadily advancing toward factual precision, especially in structured tasks, their reliability still depends on proper and careful use. As Amodei suggested, prompt design and domain context remain critical. In this ongoing competition between human intelligence and artificial intelligence, one thing is certain: it isn't merely us who hold the key to the answers; we share the test with the machines.
This content is authored by a third party. The views expressed here are those of the respective authors/entities and do not represent the views of Economic Times (ET). ET does not guarantee, vouch for or endorse any of its contents, nor is it responsible for them in any manner whatsoever. Please take all steps necessary to ascertain that any information and content provided is correct, updated, and verified. ET hereby disclaims any and all warranties, express or implied, relating to the report and any content therein.
