Tutorial · 7 min read

Hybrid Search Beats Pure Vector Search. Here's the Proof.

BM25 + semantic + reranking. We benchmarked it on 4 client datasets. The numbers aren't even close.

If your RAG system uses only vector search, you're leaving accuracy on the table. We benchmarked hybrid search against pure vector search on 4 real client datasets. Hybrid search with reranking averaged 89% precision@5, against 71% for pure vector search.

The test setup

Four datasets: healthcare policies (500 docs), legal contracts (1,200 docs), manufacturing SOPs (300 docs), and a product knowledge base (2,000 docs). 50 test queries per dataset, human-graded for relevance.
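For reference, precision@5 is the share of the top five results for a query that graders marked relevant. A minimal sketch of the metric, assuming each query yields a ranked list of document IDs plus a set of IDs judged relevant (the function names and the sample IDs are illustrative, not taken from the benchmark code):

```python
def precision_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the top-k retrieved documents that were judged relevant.

    Divides by k even when fewer than k results come back, so a short
    result list is penalized rather than rewarded.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k


def mean_precision_at_k(runs, k=5):
    """Average precision@k over (ranked_ids, relevant_ids) pairs, one per query."""
    scores = [precision_at_k(ranked, relevant, k) for ranked, relevant in runs]
    return sum(scores) / len(scores)


# Hypothetical graded output for two queries
runs = [
    (["a", "b", "c", "d", "e"], {"a", "c", "x"}),            # 2 of 5 relevant
    (["p", "q", "r", "s", "t"], {"p", "q", "r", "s", "t"}),  # 5 of 5 relevant
]
average = mean_precision_at_k(runs)
```

Averaging this value over the 50 queries gives one score per dataset, and averaging those gives the headline figure.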

Results

  • Pure vector search: 71% precision@5 average across datasets
  • BM25 only: 65% precision@5 (worse on semantic queries, better on keyword-heavy queries)
  • Hybrid (BM25 + vector): 82% precision@5
  • Hybrid + reranking: 89% precision@5

Hybrid + reranking beat pure vector by 18 percentage points. That's the difference between a useful system and a frustrating one.

Why hybrid works

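Vector search is great at semantic similarity but weak at exact matches. When a user searches for "HIPAA Section 164.512(a)", vector search tends to return loosely related privacy documents, because the embedding captures the topic rather than the literal section number. BM25 matches the token itself and returns the exact section.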

Combining both gives you semantic understanding AND keyword precision. Reranking then re-scores the merged candidates against the query, so the most relevant ones rise to the top.
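To see why the keyword side catches exact identifiers, here is a toy BM25 scorer. It is a sketch of the standard Okapi BM25 formula with commonly used defaults (k1=1.5, b=0.75), not the scorer from any particular database. The whitespace tokenizer and the three sample documents are assumptions made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Split on whitespace only, so an identifier like "164.512(a)" stays one token
    tokens = (t.strip(".,;:") for t in text.lower().split())
    return [t for t in tokens if t]


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Number of documents containing each term
    doc_freq = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            length_norm = k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Hypothetical mini-corpus
docs = [
    "General privacy policy for patient health information",
    "HIPAA Section 164.512(a) permits disclosures required by law",
    "HIPAA privacy rule overview for covered entities",
]
scores = bm25_scores("HIPAA Section 164.512(a)", docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

The document containing the literal section number scores highest because the rare token "164.512(a)" carries a high IDF weight. The general privacy policy shares no query tokens and scores zero.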

Our production setup

We use a weighted combination: 0.3 BM25 + 0.7 semantic, followed by Cohere Rerank v3. The weights were tuned per dataset: legal documents need a higher BM25 weight because their queries are more keyword-heavy, while general knowledge bases lean semantic.
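A minimal sketch of a weighted combination like this one. BM25 scores and embedding similarities live on different scales, so this version min-max normalizes each list before mixing. The normalization, the function names, and the sample scores are assumptions for illustration; the text above only specifies the weights.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the 0..1 range."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_scores(bm25, semantic, bm25_weight=0.3, semantic_weight=0.7):
    """Weighted sum of normalized scores; a doc missing from one list gets 0 there."""
    bm25_n, sem_n = min_max(bm25), min_max(semantic)
    doc_ids = set(bm25_n) | set(sem_n)
    return {
        doc_id: bm25_weight * bm25_n.get(doc_id, 0.0)
        + semantic_weight * sem_n.get(doc_id, 0.0)
        for doc_id in doc_ids
    }


def rank(scores):
    """Document IDs ordered from highest to lowest combined score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical raw scores from the two retrievers
bm25 = {"a": 12.0, "b": 3.0, "c": 0.0}
semantic = {"a": 0.60, "b": 0.80, "d": 0.70}

default_order = rank(hybrid_scores(bm25, semantic))
# Illustrative keyword-leaning weights; these are not tuned values from the benchmark
keyword_order = rank(hybrid_scores(bm25, semantic, bm25_weight=0.7, semantic_weight=0.3))
```

With the 0.3/0.7 weights the semantically closest document "b" ranks first; shifting weight toward BM25 moves the strong keyword match "a" to the top. That shift is the lever being tuned per dataset.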

Implementation tip

Most vector databases now support hybrid search natively (Pinecone, Weaviate, Qdrant). You don't need to build this from scratch. The reranking step is the one most teams skip, and in our benchmark it added another 7 points of precision@5 on top of hybrid retrieval (82% to 89%).
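The reranking step itself is small: take the merged candidates, score each one against the query with a stronger model, and keep the top few. The sketch below shows that shape with a pluggable scoring function. The term-overlap scorer is a stand-in so the example runs without external services; in a real pipeline you would pass in a call to a cross-encoder or a hosted reranker instead. All names here are hypothetical.

```python
from typing import Callable, List, Sequence


def rerank(
    query: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str], float],
    top_k: int = 5,
) -> List[str]:
    """Re-score candidates against the query and return the best top_k.

    Ties keep the original retrieval order.
    """
    scored = [(score_fn(query, doc), position, doc) for position, doc in enumerate(candidates)]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [doc for _, _, doc in scored[:top_k]]


def overlap_score(query: str, doc: str) -> float:
    """Toy relevance score: share of query terms that appear in the document."""
    query_terms = set(query.lower().split())
    doc_terms = set(doc.lower().split())
    return len(query_terms & doc_terms) / len(query_terms) if query_terms else 0.0


# Hypothetical candidates as returned by the hybrid retrieval stage
candidates = [
    "refund policy for enterprise plans",
    "how to reset a password",
    "password reset policy for admins",
]
top = rerank("password reset policy", candidates, overlap_score, top_k=2)
```

Keeping the scorer as a parameter means the retrieval code does not change when you swap reranking models.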

Written by the Xceed AI team.