AI search tools fail on 60% of news queries, Perplexity best performer

Imagine a vivid dream…of a fake URL.

Generative AI search tools are proving alarmingly unreliable for news queries, according to new research from Columbia Journalism Review’s Tow Center. With roughly 25% of Americans now turning to AI models instead of traditional search engines, tools that deliver incorrect information more than 60% of the time raise significant concerns about public access to accurate information and about unintended consequences for both news publishers and information consumers.

The big picture: A new Columbia Journalism Review study found generative AI search tools incorrectly answer over 60 percent of news-related queries, raising serious concerns as Americans increasingly adopt these tools as search engine alternatives.

Key details: Researchers tested eight AI-driven search tools by giving each one direct excerpts from news articles and asking it to identify the headline, publisher, publication date, and URL of the source article.

The study included 1,600 queries across eight different generative search platforms.
Instead of declining to respond when information was unavailable, most tools confabulated plausible-sounding but incorrect answers.

Important stats: Error rates varied dramatically among the platforms tested in the research.

Perplexity provided incorrect information in 37 percent of queries.
ChatGPT Search incorrectly identified 67 percent (134 out of 200) of articles.
Grok 3 demonstrated the worst performance, with a 94 percent error rate.
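
The percentages above follow from simple arithmetic, sketched below. The even split of 1,600 queries across eight tools is an inference from the "134 out of 200" figure rather than something the summary states directly, and the function and variable names are illustrative only.

```python
# Figures quoted in the article.
TOTAL_QUERIES = 1600
TOOLS_TESTED = 8


def error_rate(incorrect: int, total: int) -> float:
    """Return the share of incorrect answers as a percentage."""
    return 100 * incorrect / total


# Assumption: queries were divided evenly among the tools.
queries_per_tool = TOTAL_QUERIES // TOOLS_TESTED

# ChatGPT Search: 134 incorrect answers out of its share of queries.
chatgpt_rate = error_rate(134, queries_per_tool)

print(queries_per_tool, chatgpt_rate)
```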

Behind the numbers: Surprisingly, premium paid versions of these AI search tools often performed worse than their free counterparts.

Perplexity Pro ($20/month) and Grok 3’s premium service ($40/month) delivered incorrect responses more confidently than free versions.
More than half of citations from Google’s Gemini and Grok 3 led to fabricated or broken URLs.

Why this matters: The research identified serious technical violations that could impact both publishers and information consumers.

Evidence suggests some AI tools ignored Robots Exclusion Protocol (robots.txt) settings, which publishers use to tell crawlers which content they may not access.
URL fabrication was common, creating an illusion of credibility through citations to pages that don’t actually exist.
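
For readers unfamiliar with the Robots Exclusion Protocol, here is a minimal sketch of how a well-behaved crawler could check a publisher's rules before fetching a page, using Python's standard `urllib.robotparser`. The robots.txt contents, the crawler name `ExampleAIBot`, and the example.com URL are all hypothetical; nothing here reflects how the tools in the study are actually implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one made-up AI crawler, allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


article = "https://example.com/news/story"
print(is_allowed("ExampleAIBot", article))  # blocked crawler
print(is_allowed("OtherBot", article))      # falls under the wildcard rule
```

The protocol is advisory: it only works if the crawler runs a check like this and honors the result, which is why ignoring it is hard for publishers to prevent.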

AI search engines give incorrect answers at an alarming 60% rate, study says

Ars Technica
