Observability
Logging the prompt that actually shipped
30 June 2025 · 7 min
The single highest-leverage thing we did for reliability was unglamorous: record the exact, fully-resolved prompt that was sent, every time.
Templates lie. The prompt in your code is not the prompt that shipped — by the time variables, retrieved context, and truncation have done their work, the string the model actually saw can be surprising. If you only log the template, every incident starts with a reconstruction.
Log the resolved string
Store the final rendered prompt, the model and parameters, the resolved context, and a hash that ties it to the response. Storage is cheap; a wrong answer you cannot reproduce is not.
Most of reliability is being able to ask “what did it actually see?” and get a straight answer in one query.