The thinking indicator
Everything you put on screen between enter and the first token.
Every AI product has a gap between enter and the first streamed token. Sometimes it's 400ms, sometimes it's 14 seconds. What you put on the screen during that gap is the clearest signal in the whole product about how seriously you've thought about your user.
Dots are a polite lie. They tell the user nothing, but reassure them something is happening. At 400ms, fine. At 4 seconds, the lie starts to hurt. At 14 seconds, it's malpractice.
"A three-dot shimmer is a polite lie. A six-word summary of what the model decided to do is a product."
A précis, not a spinner.
The pattern I want is a single line of text, in the model's voice, that names what it is currently doing. Not "thinking." Not "working on it." A specific verb and a specific object. "Reading the PRD, checking for scope drift against stated goals." Six to twelve words, maximum.
The précis does three things at once. It confirms the model understood the question. It sets expectations for the kind of answer. And it earns the wait by showing the wait is being spent on the right thing.
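The length constraint is mechanical enough to lint. A minimal sketch; `isValidPrecis` and the list of banned fillers are my own illustrations, not something the article specifies:

```typescript
// Hypothetical check for the précis format described above:
// six to twelve words, and not a generic filler like "thinking".
const FILLERS = new Set(["thinking", "working on it", "loading"]);

function isValidPrecis(precis: string): boolean {
  const words = precis.trim().split(/\s+/).filter(Boolean);
  if (words.length < 6 || words.length > 12) return false;
  return !FILLERS.has(precis.trim().toLowerCase());
}
```

A check like this belongs in tests or prompt evaluation, not at runtime: if the model emits a bad précis in production, showing it is still more honest than swapping in a spinner.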
When to show what.
- 0 to 400ms: Nothing. The composer collapses. The cursor moves to the response. No indicator is needed at this scale — the user will see the first token before they've finished reading an indicator anyway.
- 400ms to 2s: A single-line précis in italic, in the model's voice. No animation. Nothing bouncing.
- 2s to 8s: Keep the précis. Add an elapsed counter, small, monospace, on the right. Add a cancel affordance. Never a retry.
- Past 8s: The précis should now update with a second sentence. "Still reading the PRD. This one is longer than most." The update itself is the reassurance.
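The thresholds above amount to a small state machine over elapsed time. A sketch, assuming the thresholds as stated; the type and function names are mine:

```typescript
// Maps elapsed time to the indicator state described above.
// Note there is never a "retry" state, and no animated state at all.
type Indicator =
  | { kind: "none" }                                        // under 400ms
  | { kind: "precis"; text: string }                        // 400ms to 2s
  | { kind: "precis-counter"; text: string; elapsedMs: number; cancellable: true }  // 2s to 8s
  | { kind: "precis-updated"; text: string; update: string; elapsedMs: number; cancellable: true }; // past 8s

function indicatorFor(elapsedMs: number, precis: string, update: string): Indicator {
  if (elapsedMs < 400) return { kind: "none" };
  if (elapsedMs < 2000) return { kind: "precis", text: precis };
  if (elapsedMs < 8000) return { kind: "precis-counter", text: precis, elapsedMs, cancellable: true };
  return { kind: "precis-updated", text: precis, update, elapsedMs, cancellable: true };
}
```

Driving this from a timer rather than from render events matters: the state should advance even when no tokens are arriving, because that is exactly when the user is watching it.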
Honesty is cheap, theater is expensive.
The reason thinking indicators go wrong is that teams treat them as branding surfaces. A bouncing mascot. A custom shimmer animation. A gradient orb that pulses. These are all attempts to turn latency into personality, and they all fail for the same reason: they compete with the answer for the user's attention.
A thinking indicator is a servant, not a performance. Its job is to keep the user oriented while the model works, then get out of the way. Typographic treatment carries everything. Animation, almost nothing.
What this pattern gets wrong when it gets wrong.
- Latency lie: the interface pretends a speed the backend doesn't have; spinners that bounce faster than the real throughput.
- Phantom tool: a visible tool call that didn't happen, or happened with different arguments than shown.
- Confidence theater: language or typography that performs certainty beyond what the model actually has.
Three shipping variants worth copying.
- A single-line précis of the chosen approach
- An ambient elapsed counter, shown only past 4s
- A cancel affordance that actually cancels