
Three gaps: coverage, synthesis, intent

Most AI insights requests get treated as a synthesis problem. That's the wrong framing. There are three stacked gaps - coverage, synthesis, intent - and you can't skip a layer without trust collapsing underneath you.


When someone asks an AI product for insights, they are usually handed a synthesis. “Here are the three themes from your data.” This is the wrong shape of answer, and it fails in a way that is hard to see at first because the synthesis reads beautifully. The problem is that synthesis is the middle of three stacked gaps, and a clean middle layer built on a broken bottom or pointed at the wrong top produces something confident and useless.

I learned to name the three gaps while building AI study analysis at Lyssna, where the data was raw user research and the readers were people whose decisions depended on getting it right. The gaps stack. You cannot skip one, and the layer where you fail determines how the failure looks.

Coverage: getting the data analysable at all

The bottom gap is the least glamorous and the most fatal. Coverage is the problem of getting raw, messy input into a form you can actually analyse. Research data is open-text answers, interview transcripts, half-finished sessions, the participant who misread the question, the response in a second language, the recording that cut out. Before you can find a theme you have to have read everything, parsed it, and represented it in a structure that supports counting.

The old truism still holds: garbage in, garbage out. The best synthesis in the world can’t rescue data that was never read properly to begin with. What has changed is the tool you reach for. Most coverage work is unstructured text - open-text answers, comments, reviews, support tickets - and for years the only way to process it at scale was to write code against natural-language-processing packages that were mediocre at language. The large language model is the better instrument, which shouldn’t surprise anyone: language is the thing it does best. You can have it read each response and pull out the real category, the sentiment, the detail you need, or translate it before any analysis runs, in plain language rather than through a brittle parser.

This is the first problem hey anna was built to solve. An AI formula run down every row of a spreadsheet turns a column of messy free text into something analysable: the true category, the sentiment, a clean translation across languages, before any of the actual analysis begins. That is the wrangling stage, finally done with the right tool for language.
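As a rough illustration of the row-by-row idea, here is a minimal Python sketch. It is not how hey anna is implemented: `label_column`, `LabelledRow` and `keyword_labeller` are hypothetical names, and the keyword rules are a stand-in for what would in practice be a prompt sent to a language model. The one design point it tries to show is that every input row produces exactly one output row, including the rows that fail.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LabelledRow:
    row_id: int
    text: str
    category: Optional[str]      # None means the labeller could not decide
    error: Optional[str] = None  # kept so that failures stay visible


def label_column(rows: list[str], labeller: Callable[[str], str]) -> list[LabelledRow]:
    """Run `labeller` on every row and return exactly one result per input row."""
    out = []
    for i, text in enumerate(rows):
        try:
            out.append(LabelledRow(i, text, labeller(text)))
        except Exception as exc:  # record the failure instead of dropping the row
            out.append(LabelledRow(i, text, None, error=str(exc)))
    return out


def keyword_labeller(text: str) -> str:
    """Hypothetical stand-in for a model call; a real one would prompt an LLM."""
    if not text.strip():
        raise ValueError("empty response")
    lowered = text.lower()
    if "price" in lowered or "expensive" in lowered:
        return "pricing"
    if "confus" in lowered or "lost" in lowered:
        return "navigation"
    return "other"


responses = ["Too expensive for what it does", "", "I got lost after signup"]
labelled = label_column(responses, keyword_labeller)
for row in labelled:
    print(row.row_id, row.category, row.error)
```

Because the labeller is passed in as a plain callable, the same loop would work unchanged if the keyword stub were swapped for a real model call.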

Failure here is silent, which is what makes it dangerous. Suppose your pipeline quietly drops the responses it could not parse, and those happen to be the longest and angriest ones, because long angry text breaks parsers. Your synthesis now says “users are broadly satisfied” and it is wrong at the root. Nobody can see the error by reading the output, because the missing data left no hole on the page. The synthesis is internally flawless. It is just describing the data that survived, not the data you collected.

Most teams underinvest here because coverage work is unrewarding and invisible when it succeeds. It is also the only layer where being 90% complete can be worse than useless, because the missing 10% is rarely random.
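One way to make a silent drop visible is to audit what was lost, not only what was kept. The sketch below assumes you still have the raw responses and know which row indices survived parsing. The function names and the 1.5x length threshold are illustrative assumptions, and response length is only one crude proxy for "the missing rows are not random".

```python
from statistics import mean


def coverage_report(raw: list[str], parsed_ids: set[int]) -> dict:
    """Compare the responses that survived parsing with the ones that did not."""
    kept = [t for i, t in enumerate(raw) if i in parsed_ids]
    dropped = [t for i, t in enumerate(raw) if i not in parsed_ids]
    return {
        "total": len(raw),
        "kept": len(kept),
        "dropped": len(dropped),
        "coverage": len(kept) / len(raw) if raw else 1.0,
        "mean_len_kept": mean(len(t) for t in kept) if kept else 0.0,
        "mean_len_dropped": mean(len(t) for t in dropped) if dropped else 0.0,
    }


def drops_look_biased(report: dict, ratio: float = 1.5) -> bool:
    """Flag the run when dropped responses are much longer than kept ones.

    The 1.5x threshold is an arbitrary illustration, not a recommended value.
    """
    if report["dropped"] == 0 or report["mean_len_kept"] == 0:
        return False
    return report["mean_len_dropped"] >= ratio * report["mean_len_kept"]


raw = [
    "Fine.",
    "Works ok.",
    "Good.",
    "I tried three times to finish checkout and every time the page froze, "
    "which cost me an hour.",
]
parsed_ids = {0, 1, 2}  # the long, angry response did not survive parsing

report = coverage_report(raw, parsed_ids)
print(report)
print("biased drops:", drops_look_biased(report))
```

A 75% coverage figure on its own looks tolerable; the length comparison is what shows that the missing quarter is exactly the response the summary most needed.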

Synthesis: rolling findings up

The middle gap is the one everyone means when they say “insights.” Synthesis takes the analysable data and rolls it into something higher-order: the themes, the patterns, the “seven of twelve participants stalled at pricing.” This is real work and models are genuinely good at the language of it.

Failure here looks like over-confident pattern-finding. The model sees three responses that rhyme and declares a theme; it weights a vivid quote over a common one; it smooths twelve messy answers into a clean narrative that none of the twelve people would recognise. The output is plausible and slightly invented, and the only defence is to keep every synthesised claim traceable to the underlying responses, so a reader can click a theme and see the answers that produced it. Synthesis you cannot trace back to coverage is just well-phrased guessing.
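The traceability requirement can be sketched as a data structure: a theme is not just a sentence, it carries the IDs of the responses that support it. The names here (`Theme`, `untraceable`, `support_line`) and the minimum-support rule are assumptions made for illustration, not a description of any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class Theme:
    claim: str
    response_ids: list[int] = field(default_factory=list)


def untraceable(themes: list[Theme], responses: dict[int, str],
                min_support: int = 2) -> list[str]:
    """Return claims that cite unknown responses or too few distinct ones."""
    bad = []
    for theme in themes:
        ids = set(theme.response_ids)
        if len(ids) < min_support or not ids.issubset(responses):
            bad.append(theme.claim)
    return bad


def support_line(theme: Theme, total: int) -> str:
    """Derive the headline count from the evidence rather than from the prose."""
    return f"{len(set(theme.response_ids))} of {total} participants: {theme.claim}"


responses = {
    1: "I stopped when I saw the price.",
    2: "Would love a dark mode.",
    3: "Pricing page made me hesitate.",
}
themes = [
    Theme("stalled at pricing", [1, 3]),
    Theme("wants dark mode", [2]),        # a single response, not a pattern
    Theme("hates onboarding", [3, 9]),    # cites a response that does not exist
]

print(untraceable(themes, responses))
print(support_line(themes[0], total=len(responses)))
```

Computing the "N of M" figure from the stored IDs means the number in the summary cannot drift away from the answers a reader sees when they open the theme.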

But here is the trap: a team that attacks synthesis first, before coverage is solid, gets a layer that works perfectly on the data it can see and lies about the data it cannot. The synthesis layer cannot detect that the floor beneath it has holes. It will confidently summarise a biased sample forever.

Intent: what the person was trying to learn

The top gap is the one that gets ignored entirely, and it sits above synthesis. Intent is knowing what the person was actually trying to learn when they asked. The same dataset answers different questions, and a synthesis aimed at the wrong question is wasted no matter how good the two layers below it are.

A product manager asking “what did we learn from this study” might mean “is the new checkout flow safe to ship,” or “which of my two designs won,” or “what objection do I take to my VP on Friday.” A generic three-theme summary serves none of these. It is correct and irrelevant, which is its own kind of failure; the reader skims it, finds nothing addressed to their actual decision, and quietly stops trusting the tool. Intent failure does not look like a wrong answer. It looks like a right answer to a question nobody asked.

There is a second half to intent. The user knows their problem better than you do, which is why you ask; but knowing the problem isn’t the same as knowing what’s findable in the data, and the most valuable insight is often the one they didn’t know to ask for. I built hey anna to work that seam: to answer the question someone came with, and to act as the analyst who points them at the signal they would have walked past. The catch is that the guiding still has to serve them. A surprising finding with nothing to do with what they care about isn’t an insight, it’s a distraction. Reading intent well means widening what they thought to look for without leaving what they actually value.

Why the order matters

Put the three together and the diagnostic falls out cleanly:

Gap       | The job                                 | What failure looks like
Coverage  | get all the raw data analysable         | a confident summary of the data that survived
Synthesis | roll findings into higher-order patterns | plausible themes that smooth over or invent
Intent    | answer what the user meant to ask       | a correct answer to the wrong question

Trust collapses from whichever layer you neglected, and it collapses in a characteristic way. Neglect coverage and the answer is confidently wrong. Neglect synthesis and the answer is either shapeless or plausibly invented. Neglect intent and the answer is irrelevant. Crucially, you cannot patch a lower failure from a higher layer: no amount of synthesis brilliance rescues missing coverage, and no amount of intent-reading rescues a synthesis built on a biased sample.

The reason most AI insights products feel untrustworthy is that they ship the middle layer alone. Synthesis demos well, so it gets built first and shown first, while coverage is half-finished underneath and intent is assumed rather than asked. The fix is to build bottom-up and check top-down: secure the data, make every rolled-up claim traceable to it, and then aim the whole thing at the question the person actually came with. Skip a layer and the trust does not erode gradually. It falls through the gap you left open.