What the planning domain taught me about retrieval
We spent months tuning embeddings before realising the problem was that our documents were the wrong shape.
UK planning documents are long, cross-referencing, and full of clauses that only mean something in the context of the policy they amend. Chunking them on token count gave us retrieval that was technically relevant and practically useless: fragments that cited a rule without including the rule itself.
Shape beats similarity
The fix was structural, not statistical. We chunked along the document’s own hierarchy — policy, clause, amendment — and attached the parent context to every child. Retrieval quality jumped more from that than from any embedding-model change we tried.
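A minimal sketch of what chunking along the hierarchy can look like. The `Node` schema, the field names and the sample policy text are all invented for illustration; a real pipeline would first have to parse the documents into a tree like this.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One level of the document hierarchy (hypothetical schema)."""
    kind: str   # e.g. "policy", "clause", "amendment"
    label: str  # short identifier, e.g. "H1.2"
    text: str
    children: list["Node"] = field(default_factory=list)


def chunk_with_context(node, ancestors=()):
    """Yield one chunk per node, prefixed with the text of every ancestor."""
    lines = [f"[{a.kind}] {a.label}: {a.text}" for a in ancestors]
    lines.append(f"[{node.kind}] {node.label}: {node.text}")
    yield {
        "id": "/".join([a.label for a in ancestors] + [node.label]),
        "text": "\n".join(lines),
    }
    for child in node.children:
        yield from chunk_with_context(child, ancestors + (node,))


# Made-up example content, not taken from any real planning document.
policy = Node("policy", "H1", "New housing must include affordable units.", [
    Node("clause", "H1.2", "At least 30% of units must be affordable.", [
        Node("amendment", "A3", "Replace 30% with 35%."),
    ]),
])

chunks = list(chunk_with_context(policy))
for chunk in chunks:
    print(chunk["id"])
    print(chunk["text"])
    print()
```

The amendment chunk now carries the clause and policy it modifies, so a retrieved fragment arrives with the rule it refers to. The cost is duplicated parent text across chunks, which may need trimming for very deep or very long parents.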
Build the eval before the index
The thing that made progress measurable was a small, honest evaluation set: real questions, each paired with the passages an expert would expect back. Once we could measure retrieval, every change became a decision instead of a guess.
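One way to turn such a set into a number is recall at k: of the passages the expert expected, what fraction appears in the top k results. The sketch below assumes passages have stable IDs and that `retrieve` is whatever function wraps the index; the questions, IDs and the stand-in retriever are hypothetical, and recall at k is one reasonable metric rather than the only one.

```python
def recall_at_k(retrieved_ids, expected_ids, k):
    """Fraction of expected passages found in the top-k retrieved results."""
    expected = set(expected_ids)
    if not expected:
        raise ValueError("each question needs at least one expected passage")
    hits = expected & set(retrieved_ids[:k])
    return len(hits) / len(expected)


def evaluate(eval_set, retrieve, k=5):
    """Mean recall@k over (question, expected_ids) pairs."""
    scores = [recall_at_k(retrieve(q), expected, k) for q, expected in eval_set]
    return sum(scores) / len(scores)


# Hypothetical evaluation set: question -> passage IDs an expert expects.
eval_set = [
    ("What share of units must be affordable?", ["H1/H1.2", "H1/H1.2/A3"]),
    ("Which policy covers new housing?", ["H1"]),
]


def fake_retrieve(question):
    """Stand-in for a real index lookup; returns ranked passage IDs."""
    canned = {
        "What share of units must be affordable?": ["H1/H1.2", "H1", "H1/H1.2/A3"],
        "Which policy covers new housing?": ["H1/H1.2", "H1"],
    }
    return canned[question]


score_at_1 = evaluate(eval_set, fake_retrieve, k=1)
score_at_3 = evaluate(eval_set, fake_retrieve, k=3)
print(score_at_1, score_at_3)
```

Running the same `evaluate` call before and after a change to chunking or embeddings gives a like-for-like comparison, which is the point of building the set first.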