Why RAG in Agents
Agents need current, correct data. RAG retrieves grounding before generation. Without it, agents hallucinate — especially on customer data. With good RAG, agents ground responses in authoritative sources.
Chunking
Chunks too small = missing context. Too large = irrelevant content in prompt. Sweet spot for CRM knowledge: 500–800 tokens with 20% overlap. Preserve heading and section boundaries where possible.
Hybrid Search
Pure vector similarity misses exact-match needs. Hybrid = BM25 + vector with reranking wins consistently. Weaviate supports native hybrid; others need layered approaches.
Measurement
Retrieval quality gates generation quality. Build an eval set — queries paired with ideal passages. Recall@5 and MRR track performance. Re-baseline when you change embedding model or chunking strategy.