Lab 3 — Build a Book Recommendation RAG Engine

Summary — what this page covers Attendees build /api/recommend: ingest book data, embed and store it in Qdrant, retrieve by similarity, and return grounded recommendations — committed. Provide the free Ollama path inline so no one is blocked on an OpenAI key.

Duration: 40 min · Deliverable: working /api/recommend with grounded responses — committed

Part A — Vector store + embeddings (≈15 min)

Start Qdrant and create the collection with the vector size set to your embedding model's dimensions (768 for nomic-embed-text). Implement EmbeddingService — OpenAI or the free Ollama nomic-embed-text path. Then ingest the book data, chunk it, embed each chunk, and upsert.

docker run -p 6333:6333 qdrant/qdrant

If the collection's vector size doesn't match the model's output dimensions, upserts fail — this is the #1 gotcha. Match them.

Part B — Retrieval + augmented prompt (≈15 min)

Implement RagService: embed the query → top-K cosine search in Qdrant → build an augmented prompt injecting the retrieved chunks as context → call Claude. Wire POST /api/recommend.

Part C — Evaluate (≈10 min)

Run the same query with and without retrieved context and compare the answers — the grounded one cites real books, the ungrounded one guesses. Then tune chunk size/overlap and re-run: watch the retrieval quality (and the answer) change. Reinforce the section's thesis: quality comes from chunking, not the LLM.

Checkpoint

Qdrant collection created with matching vector size
Book data ingested, chunked, embedded, and stored
/api/recommend returns recommendations grounded in retrieved context
You compared grounded vs ungrounded output
Committed to your fork

Bonus — Hybrid Search & Evaluation

No time pressure. Add hybrid search — combine keyword matching with vector similarity so exact title/author hits rank alongside semantic matches — and a small eval harness that scores retrieval quality (did the right chunks come back?) so you can measure your chunking tweaks instead of eyeballing them.