Program syllabus
Hands-on retrieval engineering for teams who have moved past “hello-world RAG” and need systems that stay relevant, reliable, and measurable in production.
A short overview of the program: who it's for, what we cover, and how to get the most value out of it as a busy professional.
Use this to design a retrieval stack that is boring, reliable, and measurable — even as your content and models change.
Module 1
We make the data flow explicit so everyone can see where failures can happen — and how to instrument them.
We start with your current assistant or RAG prototype and map how data actually moves: from sources, through ingestion and indexing, into queries, ranking, and responses. Then we compare that to the flows you'd want in production.
We sketch a simple diagram that shows how queries flow through your stack.
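The same flow can be written down in code so it lives next to the system it describes. The stage names below are illustrative, not the program's canonical list:

```python
# Illustrative stages of a RAG query path; real stacks add caching,
# fallbacks, and logging at each hop.
STAGES = [
    "sources",    # docs, wikis, tickets, databases
    "ingestion",  # parsing, cleaning, chunking
    "indexing",   # embeddings and/or a search index
    "query",      # user question, optional query rewriting
    "ranking",    # retrieval plus reranking
    "response",   # prompt assembly and generation
]

def describe_flow(stages: list[str]) -> str:
    """Render the pipeline as a one-line diagram for docs and dashboards."""
    return " -> ".join(stages)

print(describe_flow(STAGES))
```

Keeping the stage list in code makes it easy to attach per-stage metrics and failure counters later.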
This becomes the reference diagram for both engineering and non-technical stakeholders when you discuss reliability or new features.
Module 2
We design content pipelines around how people actually ask questions, not just how documents are stored today.
Most RAG failures start before a single query is run. We focus on units of retrieval: what exactly do you want the model to see, and how do you make that unit easy to find?
Together we define a target schema for your retrieval store (vector, search index, or both).
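As a sketch of what such a schema can look like, here is one possible unit-of-retrieval record. The field names are assumptions for illustration; the real schema comes out of the workshop:

```python
from dataclasses import dataclass, field

@dataclass
class RetrievalUnit:
    """One unit of retrieval: exactly what the model will see.

    Field names are illustrative; adapt them to your store
    (vector DB, search index, or both).
    """
    unit_id: str            # stable ID that survives re-ingestion
    source: str             # system of record (wiki, CRM, ticketing, ...)
    title: str              # human-readable handle, used for citations
    body: str               # the chunk the model actually sees
    embedding: list[float]  # vector for semantic search
    keywords: list[str] = field(default_factory=list)  # for lexical search
    updated_at: str = ""    # ISO timestamp, drives freshness checks
    acl: list[str] = field(default_factory=list)       # who may retrieve it
```

A stable `unit_id` is the piece teams most often skip, and it is what makes re-ingestion and evaluation repeatable.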
This schema guides ingestion work and avoids one-off pipelines for every new content source.
Module 3
We go beyond "top-k from the vector store" to a layered retrieval strategy that balances relevance, latency, and cost.
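One way to sketch that layering: wide, cheap candidate generation from multiple signals, then a narrower, more expensive rerank. The retriever and reranker callables below are stand-ins, not a specific vendor API:

```python
def layered_retrieve(query, vector_search, keyword_search, rerank,
                     k=5, fanout=20):
    """Layered retrieval: merge cheap candidate pools, rerank the shortlist.

    vector_search / keyword_search: callables (query, n) -> [(doc_id, score)]
    rerank: callable (query, doc_id) -> score (the expensive layer)
    """
    # Layer 1: wide, cheap candidate generation from two signals.
    candidates: dict[str, float] = {}
    for doc_id, score in vector_search(query, fanout):
        candidates[doc_id] = max(candidates.get(doc_id, 0.0), score)
    for doc_id, score in keyword_search(query, fanout):
        candidates[doc_id] = max(candidates.get(doc_id, 0.0), score)
    # Layer 2: run the expensive reranker only on the merged shortlist.
    ranked = sorted(candidates, key=lambda d: rerank(query, d), reverse=True)
    return ranked[:k]
```

The point of the layering is cost control: the reranker only ever sees `2 * fanout` candidates at most, regardless of corpus size.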
We wire retrieval into tests and dashboards so you can safely iterate on prompts, models, and indexes.
Retrieval quality drifts over time as content, users, and models change. We design lightweight guardrails so you catch issues before customers do.
We assemble a small evaluation workbook for one of your assistants.
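A minimal version of such a workbook can be plain code: a gold set of real questions paired with the units that should surface, plus a recall metric tracked per release. The questions and IDs below are made up for illustration:

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant units that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)

# Gold set: real user questions paired with the units that should surface.
GOLD = [
    {"question": "How do I rotate API keys?", "relevant": {"kb-17", "kb-203"}},
    {"question": "What is our refund policy?", "relevant": {"kb-88"}},
]

def evaluate(retriever, gold=GOLD, k=5) -> float:
    """Average recall@k across the gold set; re-run on every index change."""
    scores = [recall_at_k(retriever(case["question"]), case["relevant"], k)
              for case in gold]
    return sum(scores) / len(scores)
```

Even a gold set of 20 to 30 questions is enough to catch most regressions from a chunking or embedding change before customers do.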
We typically run this as a 1–2 week engagement anchored on one high-value assistant or workflow. You bring real data and failure modes; we bring patterns, templates, and a ruthless focus on production behavior.
By the end, you'll have a retrieval design, evaluation plan, and observability story your team can own and iterate on.
This pairs especially well with Workflow AI Design Blueprint and AI Assistant Observability & SLOs for a full stack of reliability-focused AI capabilities.