The error surface
When the model fails, don't pretend it didn't.
Every AI product fails, frequently, in the middle of a response. The tool times out. The rate limit trips. The policy filter bites. The question isn't whether failures happen. It's whether the user can see the failure, understand it, and act on it in five seconds.
A generic 'something went wrong' teaches users one thing: don't trust this product. A typed error with a named class, a recovery action, and a trace id teaches the opposite. Errors are the cheapest way to build trust if you treat them as a surface, not an afterthought.
"An error the user can act on in five seconds is a feature. Everything else is a support ticket in disguise."
Named class. Recovery action. Trace.
Four parts. First, a class label at the top (rate_limit, timeout, refused, tool_fail). Second, a one-sentence headline that describes what actually happened in plain words. Third, a primary recovery action, specific to that class. Fourth, a copyable trace id in a muted monospace line at the bottom, so the user can report or debug.
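The four parts map naturally onto a small typed shape. A minimal sketch in TypeScript; the interface and field names are assumptions for illustration, not a standard:

```typescript
// Hypothetical shape for a typed error card: class, headline,
// one primary recovery action, and a copyable trace id.
type FailureClass = "rate_limit" | "timeout" | "refused" | "tool_fail";

interface ErrorCard {
  failureClass: FailureClass; // top label: what kind of failure
  headline: string;           // one plain-words sentence about what happened
  recoveryLabel: string;      // single primary action, specific to the class
  traceId: string;            // muted monospace line, copyable
}

// Example instance mirroring the timeout card below (trace id is made up).
const card: ErrorCard = {
  failureClass: "timeout",
  headline: "The tool took longer than 30s and I cut it.",
  recoveryLabel: "Retry",
  traceId: "trace_8f2c01",
};
```

Keeping the card a plain data shape means the same object can render the UI, feed logging, and populate a diagnostic report without translation.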
The tool took longer than 30s and I cut it.
I killed the 'query_warehouse' call. Your prompt is safe. You can retry, or run the same prompt against the cached snapshot (2h old).
Failure class, recovery action, and a copyable trace id. The user can act in five seconds.
Most error copy is a legal apology.
Default error states try to sound neutral, which means they say nothing. But errors are where trust is most elastic. A specific error, delivered well, makes the product feel robust. A vague error, delivered defensively, makes the product feel fragile. The underlying failure is identical either way; only the delivery differs.
Errors as a UI.
- Type the failure. Every error carries a class. The user shouldn't have to guess whether retrying will help.
- Primary action, not a menu. One clear recovery. 'Retry in 42s' beats a row of 'learn more, retry, cancel, details' every time.
- Expose the trace. A copyable trace id makes every error a bug report the user can file in three seconds. Your team wins every time.
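The one-primary-action rule can be enforced in code by mapping each failure class to exactly one recovery label. A sketch; the default retry window and the idea of reading it from a Retry-After-style value are assumptions about what your API exposes:

```typescript
type FailureClass = "rate_limit" | "timeout" | "refused" | "tool_fail";

// One recovery action per class: the user never sees a menu.
function primaryAction(cls: FailureClass, retryAfterSec?: number): string {
  switch (cls) {
    case "rate_limit":
      // 60s fallback is an assumption; use the server's value when it sends one.
      return `Retry in ${retryAfterSec ?? 60}s`;
    case "timeout":
      return "Retry";
    case "refused":
      return "Rephrase and resend";
    case "tool_fail":
      return "Run against cached snapshot";
  }
}
```

An exhaustive switch over the class union means adding a new failure class is a compile error until it has a recovery action, which is exactly the invariant the pattern asks for.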
Errors that look like successes.
The worst failure mode isn't an ugly error. It's an error that doesn't exist: the response silently truncates, the tool call is skipped, the model pivots to a different answer. The user assumes the product is fine. They only find out when a colleague asks why the number doesn't match.
What this pattern gets wrong when it goes wrong.
- Ambiguous state: running, done, errored, and paused all look the same, so the user has to infer from context.
- Silent truncate: the response ran out of room or tokens and the product didn't tell the user where it stopped.
- Phantom tool: a visible tool call that didn't happen, or happened with different arguments than shown.
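Silent truncation, at least, is detectable: most model APIs report why generation stopped. A minimal sketch; the `finishReason` field and its `"length"` value mirror common APIs, but your provider's names may differ:

```typescript
// Surface truncation instead of hiding it: if the model stopped for
// length, append a visible marker rather than presenting a clean ending.
function markTruncation(text: string, finishReason: string): string {
  if (finishReason === "length") {
    return text + "\n\n[Response cut off here: ran out of room.]";
  }
  return text;
}
```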
Three shipping variants worth copying.
- A typed error card with a failure class and recovery action
- Rate-limited vs. timed-out vs. refused are visually distinct
- A 'copy diagnostic' button that includes the trace id
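The 'copy diagnostic' button can bundle more than the trace id. A sketch of a plausible payload; every field beyond the trace id is an assumption about what the client knows at error time:

```typescript
// Hypothetical diagnostic bundle behind a 'copy diagnostic' button.
interface Diagnostic {
  traceId: string;
  failureClass: string;
  timestamp: string; // ISO 8601, client clock
}

// Build the string the button puts on the clipboard:
// plain text, so it pastes cleanly into a bug report or a chat message.
function diagnosticText(d: Diagnostic): string {
  return [
    `trace: ${d.traceId}`,
    `class: ${d.failureClass}`,
    `at: ${d.timestamp}`,
  ].join("\n");
}
```

Keeping the payload human-readable matters: the user should be able to glance at what they're about to paste and see there's nothing sensitive in it.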