back

CoT May Be Highly Informative Despite “Unfaithfulness”

Source

Published

Oct 12, 2025

Share On

Get SIGNAL/NOISE in your inbox daily

Recent work from Anthropic and others claims that LLMs’ chains of thoughts can be “unfaithful”. These papers make an important point: you can’t take everything in the CoT at face value. As a result, people often use these results to conclude the CoT is useless for analyzing and monitoring AIs. Here, instead of asking whether the CoT always contains all information relevant to a model’s decision-making in all problems, we ask if it contains enough information to allow developers to monitor models in practice. Our experiments suggest that it might.

CoT May Be Highly Informative Despite “Unfaithfulness”

Recent Stories

Stop ignoring AI risks in finance, MPs tell BoE and FCA

OpenAI CFO Friar: 2026 is year for ‘practical adoption’ of AI

OpenAI’s 2026 ‘focus’ is ‘practical adoption’