Do LLMs really “show their work” when they perform chain-of-thought reasoning? “Measuring Faithfulness in Chain-of-Thought Reasoning” is a new paper from Anthropic that studies this question empirically with a series of tests.

Timestamps:
00:00 – Measuring Faithfulness in Chain-of-Thought Reasoning
00:53 – What is Chain-of-Thought reasoning?
03:15 – Do the Chain-of-Thought Steps Really Reflect the Model’s Reasoning?
07:03 – Possible Faithfulness Failures
08:44 – Encoded Reasoning/Steganography
12:01 – Experiment Details
15:44 – Does Truncating the Chain of Thought Change the Predicted Answer?
16:53 – Does Editing the Chain of Thought Change the Predicted Answer?
17:14 – Do Uninformative Chain of Thought Tokens Also Improve Performance?
18:28 – Does Rewording the Chain of Thought Change the Predicted Answer?
20:20 – Does Model Size Affect Chain of Thought Faithfulness?
22:04 – Limitations
24:38 – Externalized Reasoning Oversight

Topics: #ai #anthropic #CoT #reasoning

Link to the paper: https://www-files.anthropic.com/production/files/measuring-faithfulness-in-chain-of-thought-reasoning.pdf

For related content:
– Twitter: https://twitter.com/SamuelAlbanie
– Research lab: https://caml-lab.com/
– Personal webpage: https://samuelalbanie.com/
– YouTube: https://www.youtube.com/@SamuelAlbanie1
– TikTok: https://tiktok.com/@samuelalbanie
– Instagram: https://instagram.com/samuelalbanie
– LinkedIn: https://www.linkedin.com/in/samuel-albanie
– Threads: https://www.threads.net/@samuelalbanie
– Discord server for filtir: https://discord.gg/FV97NByG2b

(Optional) if you’d like to support the channel:
– https://www.buymeacoffee.com/samuelalbanie
– https://www.patreon.com/samuel_albanie

Credits:
Image credit (Chelsea photo) https://en.wikipedia.org/wiki/2004%E2%80%9305_Chelsea_F.C._season#/media/File:Champions_2004-5.jpg

7 Replies to “Measuring Faithfulness in Chain-of-Thought Reasoning”

Muhammad Emad Sarwar says:

July 26, 2023 at 12:01 pm

I really appreciate the work you're doing; it is very interesting to see someone go over research regarding AI. I was wondering if you're interested in networks, or if you know someone who is. I want to know what areas of research are being probed by researchers these days in networks. Additionally, is there a way I could reach out to you or professors regarding different research ideas and maybe develop them as well?

Jobob Miner says:

July 26, 2023 at 12:01 pm

Thanks again. Very informative video. I don't see much of the actual thought process of the AI research on a lot of AI news so this is a great insight

張安邦 says:

July 26, 2023 at 12:01 pm

If this trend keeps up, we actually have a chance at surviving superhuman AGI. Good news.

Ster says:

July 26, 2023 at 12:01 pm

Why would adding … simulate more compute time? Isn't the amount of compute the same per token, regardless of how much context comes before?

Bryan Nsoh says:

July 26, 2023 at 12:01 pm

Very enlightening, Samuel. Thank you immensely! Given these insights on chain of thought reasoning, how would you specify custom instructions for GPT4 (using the new custom instructions feature) to ensure it always outputs an optimally reasoned answer?

Omar Abul-Hassan says:

July 26, 2023 at 12:01 pm

Great video!

Artūras Paleičikas says:

July 26, 2023 at 12:01 pm

nice, thanks!