Not every AI feature should be a chat

Enterprises trust AI for invisible categorisation and distrust it for reversible work behind a chat box. 'Chat, move this five pixels' is all-or-nothing and risky. Sometimes the right surface for an AI feature is a button.

15 October 2025

The same enterprise that lets AI silently categorise a million support tickets will refuse to let an AI chat assistant touch the layout of one slide. From the outside that looks like inconsistency. It isn’t. It’s a precise read on where AI is safe to depend on and where it isn’t, and the chat box is on the wrong side of that line more often than chat-first design assumes.

Watch what enterprises actually approve. They hand AI the invisible, high-volume work: routing tickets, tagging transactions, flagging anomalies, deduplicating records. Millions of decisions a day, no human in the loop, complete trust. Then watch what they hold back. The hands-on, reversible work a person was doing themselves a minute ago: editing the document, adjusting the design, changing the number in the cell. Here they want their hands on the controls. The pattern isn’t fear of AI. It’s a sober judgement about which surface fits which task.

Two questions decide the surface

The surface for an AI feature should fall out of two properties of the task, not out of what’s fashionable to ship.

The first is how much trust the task requires. Categorisation is forgiving at scale; one misrouted ticket out of a million is noise, and the aggregate is what matters. Moving an element on a customer’s slide is unforgiving and singular; there is no average to hide in, and the one wrong move is the whole experience. Tasks that demand high trust and tolerate almost no error want the user’s hand on the wheel. Tasks that demand little trust per decision and run at high volume are exactly where you let the AI run unattended.

The second is how reversible the action is. A categorisation that runs in the background is reversible by definition; you re-run it, you correct the tag, nothing was staked. A direct manipulation of something the user is actively working on carries immediate, visible consequences they have to live with. The more reversible and invisible the work, the more autonomy the AI can safely have. The more direct and consequential, the more the user wants a control they understand.
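
As a rough illustration, the two questions can be written as a small lookup. This is only a sketch: the names `Task`, `Autonomy`, and `autonomy_for` are hypothetical, and the middle "suggest" level for mixed cases is an assumption added for the example, not a rule taken from any product.

```python
from dataclasses import dataclass
from enum import Enum


class Autonomy(Enum):
    """Hypothetical autonomy levels, from most to least hands-off."""
    UNATTENDED = "run in the background, review in aggregate"
    SUGGEST = "propose a change the user accepts or rejects"
    USER_DRIVEN = "user acts directly, AI helps only on request"


@dataclass(frozen=True)
class Task:
    name: str
    tolerates_single_errors: bool  # question 1: is one wrong result just noise?
    cheaply_reversible: bool       # question 2: can it be re-run or corrected at little cost?


def autonomy_for(task: Task) -> Autonomy:
    """Map the two answers to an autonomy level (illustrative assumption)."""
    if task.tolerates_single_errors and task.cheaply_reversible:
        return Autonomy.UNATTENDED
    if not task.tolerates_single_errors and not task.cheaply_reversible:
        return Autonomy.USER_DRIVEN
    # Mixed answers: assumed here to warrant a suggestion the user confirms.
    return Autonomy.SUGGEST


ticket_routing = Task("route support ticket", True, True)
slide_layout = Task("move element on customer's slide", False, False)
```

Under these assumptions, `autonomy_for(ticket_routing)` returns `Autonomy.UNATTENDED` and `autonomy_for(slide_layout)` returns `Autonomy.USER_DRIVEN`, which matches the pattern enterprises already follow.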

Where chat earns its place, and where it doesn’t

Chat is a genuinely good surface for a specific shape of task: open-ended, exploratory, where the user doesn’t yet know what they want and the cost of a wrong turn is just another message. “Help me think through this,” “what’s in this dataset,” “draft me three options.” The ambiguity is the point, and a conversation is the right tool for resolving ambiguity.

The fashionable AI design tools show the failure of forcing everything else into that mould. “Chat, move this box five pixels left” is slower, vaguer, and riskier than the direct control it replaced. You type a sentence, wait for a generation, and discover whether the model understood; the old way was to grab the box and move it, with your eyes closing the loop in real time. Worse, the interaction is all-or-nothing. You get the model’s whole interpretation back at once, and if it’s 90% right you’re now editing its guess instead of expressing your intent. A precise, reversible, hands-on task got wrapped in an imprecise, latent, all-or-nothing interface. The chat didn’t add power. It added distance between the user and the thing they were trying to do.

How wide is the space?

The two questions of trust and reversibility are about how much autonomy a task can bear. There’s a third, on a different axis, and it decides how much an LLM is actually buying you: how wide is the space of inputs and outputs the feature has to cover?

A traditional control is something you build by anticipating that space in advance. Every option, every edge case, every state has to be foreseen and given a component, and what you didn’t build, the user can’t do. That foresight is most of the cost of software, and it’s why narrow, well-understood tasks get clean controls and sprawling, open-ended ones get a thin UI or none at all. An LLM broadens the interface without you building any of it: it takes an input you never enumerated and produces an output you never designed a screen for. That is the real trade. When the space is too wide to enumerate, you couldn’t have built a control for every way a person might ask for the chart they want, so you let language carry the range. When the space is narrow, the model is selling you breadth you don’t need, and charging latency, a stochastic result, and an all-or-nothing round-trip for it.

And the answer isn’t fixed for the life of a feature; it can change within a single task. Generation is wide: you don’t know exactly what you’ll get, so the open interface fits. Editing is narrow: you know the precise change, and running it back through the stochastic round-trip to move one line or delete one word is slower and less certain than simply doing it yourself. The same feature wants a broad interface to create and a direct one to refine, and the good ones switch at that seam rather than trapping the user in a conversation for work their hands could finish in a second.

A decision rule

Before defaulting an AI feature to chat, run it through four questions.

Question	Lean to chat	Lean to a control
Does the user know what they want?	No, they’re exploring	Yes, they have a specific intent
How direct is the action?	Indirect, the AI does work offstage	Direct, the user is manipulating the thing
What’s the cost of a wrong interpretation?	Low, just send another message	High, you’re now editing a bad guess
How wide is the input and output space?	Wide, you can’t enumerate it	Narrow, you can build for it

When the answers fall in the right-hand column, the better surface is a button, a toggle, a slider, or an inline suggestion the user accepts or rejects in place. A button has no ambiguity to misread. A toggle is instantly reversible. An inline suggestion lets the user see the proposed change against the real thing and take it or leave it without a round-trip through a sentence. These surfaces give the AI exactly as much autonomy as the task can bear and no more, which is the whole game.
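
The four questions can also be read as a simple tally. The sketch below is one hypothetical encoding: the names `FeatureAnswers` and `recommend_surface`, the three-of-four threshold, and the "mixed" outcome for split answers are all assumptions made for illustration, not a scoring scheme the questions themselves prescribe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureAnswers:
    """One boolean per question; True means the answer leans to a control."""
    user_knows_what_they_want: bool
    action_is_direct: bool
    wrong_interpretation_is_costly: bool
    space_is_narrow: bool


def recommend_surface(answers: FeatureAnswers) -> str:
    """Tally the answers that lean to a control (thresholds are assumptions)."""
    votes_for_control = sum([
        answers.user_knows_what_they_want,
        answers.action_is_direct,
        answers.wrong_interpretation_is_costly,
        answers.space_is_narrow,
    ])
    if votes_for_control >= 3:
        return "control"  # button, toggle, slider, inline suggestion
    if votes_for_control <= 1:
        return "chat"     # open-ended, exploratory conversation
    return "mixed"        # e.g. chat to create, direct control to refine


move_box = FeatureAnswers(True, True, True, True)
think_it_through = FeatureAnswers(False, False, False, False)
```

With these inputs, `recommend_surface(move_box)` returns `"control"` and `recommend_surface(think_it_through)` returns `"chat"`. A two-two split returns `"mixed"`, which is a prompt to look for the point in the task where the surface should change rather than a verdict.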

Match the surface to the task

The chat-first default treats conversation as the universal interface for AI, and for a real slice of tasks it is the right one. The mistake is reaching for it everywhere, including the places where a direct control would be faster to use, safer to trust, and easier to reverse. The surface is not a branding decision. It’s where the trust the task demands meets the reversibility the action allows, and that meeting point is sometimes a conversation and sometimes a button.

Pick the surface the task is asking for. Often it’s quietly asking for a button.