Back to Writing

Metacognition is the unlock

Model progress has moved through paradigms: reactive, then reasoning, then agentic. The next is metacognition - thinking about its own thinking - and it's what separates a model that repeats mistakes from one that compounds on them.

4 min read

The big challenge for traditional LLMs is that they are path-dependent; while they can consider the puzzle as a whole, as soon as they commit to a particular guess they are locked in, and doomed to failure. This is a fundamental weakness of what are known as “auto-regressive large language models”, which to date, is all of them.

That’s Ben Thompson. Escaping that trap is what each paradigm of model progress has chipped at, and the paradigm you’re on sets the ceiling on what the model can do without you standing over it.

The first paradigm was reactive. Early chat models answered in one shot: prompt in, text out, no pause between. They were fluent and often right, but they couldn’t catch themselves mid-thought because there was no middle to the thought. The whole answer arrived at once.

The second was reasoning. Models learned to think step by step before answering; to lay out the working, not just the conclusion. This is a real jump. A model that reasons through a problem catches errors a reactive model commits on the spot, because the steps are now visible to the model as it produces them. Reasoning buys reliability on anything that has a chain to it.

The third is agentic. Harnesses and agents decompose a goal into steps and sequence the work: do this, then that, check the result, move on. This is what turns a model that answers questions into a system that completes tasks. It can hold a goal across many calls and make progress against it without a human threading each step.

The next paradigm

That paradigm is metacognition: thinking about its own thinking. Not reasoning through the task, but reasoning about the approach to the task. Noticing that a line of attack is failing and changing it. Noticing it’s hit the same blocker three times and trying a different approach. Treating a failed attempt as information about strategy rather than a thing to retry verbatim.

The distinction is sharp once you see it. An agentic loop that hits an error retries the step, maybe with a tweak, and retries again. It’s working hard inside a frame it never questions. A metacognitive agent asks a different question: is the frame wrong? It can question the step itself, judge the strategy that produced it, and pick a new one. The first can loop forever. The second can notice it’s looping.

In humans this is the whole game. Two people make the same mistake. One files it as bad luck and makes it again next week. The other extracts the rule, updates how they work, and never makes that class of mistake the same way twice. The second person isn’t smarter in the moment; they’re better at learning from the moment. Metacognition is what separates someone who repeats mistakes from someone who compounds on them. The mistakes are the same. What you do with them isn’t.

Where you trigger it

An agent loop usually has an observe step, a decide step, an act step. Metacognition adds a fourth: learn. That’s the seam where metacognition lives, and the useful part is that you can trigger it deliberately rather than wait for it to emerge.

After an attempt, before the next one, you make the agent answer a different kind of question. Not “what’s the next action” but “what did that attempt tell me, and should I change my approach?” Cheap to add: it’s a prompt and a place in the loop to run it. The effect is out of proportion to the cost. The agent stops treating every failure as a reason to retry and starts treating some failures as a reason to rethink. You’re not making the model smarter. You’re giving it a moment to be honest about whether the current plan is working, and a license to abandon it.

Current models don’t fully have this. They’ll cheerfully retry a doomed approach, declare success on work that failed, miss the pattern in their own errors. The paradigm isn’t here yet; it’s arriving. But you can see the shape of it, and you can scaffold toward it in the loops you build today. The learn step is where you reach for the next paradigm before the models get there on their own.

The reactive model answers. The reasoning model works through. The agentic one sequences. The metacognitive one notices it’s about to make the same mistake again, and doesn’t.