Why Vector Databases Matter
RAG systems depend on fast, accurate vector retrieval. The choice of vector database shapes latency, cost, scalability, and operational complexity. The wrong choice produces slow responses, high bills, or unreliable results.
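For RAG systems, recall (the fraction of true nearest neighbours that the index actually returns) above 0.90 is typically sufficient: the LLM can often compensate for a missing passage, but it cannot recover from context that is simply wrong.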
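To make the recall threshold concrete, here is a minimal sketch of how recall@k can be measured by comparing an approximate index's results against exact (brute-force) results for the same query. The function name and the document IDs are hypothetical and chosen only for illustration.

```python
def recall_at_k(retrieved_ids, true_neighbor_ids, k=10):
    """Fraction of the true top-k neighbours that appear in the retrieved top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(true_neighbor_ids[:k])
    if not truth:
        return 0.0
    # Set intersection so duplicate IDs in the results are not counted twice.
    hits = len(set(retrieved_ids[:k]) & truth)
    return hits / len(truth)


# Hypothetical IDs: exact search gives the ground truth, the ANN index misses one.
exact = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10"]
approx = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d42"]
print(recall_at_k(approx, exact, k=10))  # 0.9
```

In practice this would be averaged over a representative set of queries rather than computed for a single one.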
We evaluate vector databases on four dimensions: performance (latency, throughput, recall), cost (infrastructure, licensing, operational), operational complexity (self-hosted vs managed, scaling, backup), and ecosystem (integrations, community, support). No database wins on all dimensions; the choice is always a trade-off.
The Options Compared
Chroma
Chroma is an open-source embedding database designed for simplicity. It runs locally, persists data in SQLite (older releases used DuckDB), and provides a Python-first API that data scientists find intuitive.
Pinecone
Pinecone is a managed vector database with serverless deployment, automatic scaling, and strong metadata filtering. It abstracts away infrastructure entirely — you send embeddings, Pinecone stores and retrieves them.
Weaviate
Weaviate is an open-source vector database with GraphQL interface, hybrid search (vector + keyword), and built-in vectorisation modules. It supports both self-hosted and managed (Weaviate Cloud) deployment.
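Hybrid search has to merge a vector ranking and a keyword ranking into a single result list. The sketch below uses reciprocal rank fusion, one common way to do this. It illustrates the general idea only and is not Weaviate's implementation; the function name, the constant `k=60`, and the document IDs are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked ID lists; each list contributes 1 / (k + rank) per item."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a vector index and a keyword (BM25-style) index.
vector_hits = ["a", "b", "c"]
keyword_hits = ["c", "a", "d"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)  # ['a', 'c', 'b', 'd']
```

Documents that appear in both lists ("a" and "c") rise above those found by only one retriever, which is the behaviour hybrid search is meant to provide.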
pgvector
pgvector is a PostgreSQL extension that adds vector storage and similarity search to an existing PostgreSQL database. It performs exact nearest neighbour search by default and supports approximate search through IVFFlat and HNSW indexes.
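As a rough sketch of what this looks like in practice, the snippet below holds the SQL for a pgvector table with an HNSW index and a cosine-distance query, plus a small helper that formats a Python list as pgvector's text input. The table name `documents`, its columns, the index name, and the 768-dimension size are assumptions for illustration; executing the SQL requires a PostgreSQL driver and a server with pgvector installed, neither of which is shown here.

```python
DIMENSIONS = 768  # assumed embedding size; must match the embedding model in use

SETUP_SQL = f"""
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS documents (
    id bigserial PRIMARY KEY,
    body text,
    embedding vector({DIMENSIONS})
);
CREATE INDEX IF NOT EXISTS documents_embedding_hnsw
    ON documents USING hnsw (embedding vector_cosine_ops);
"""

# <=> is pgvector's cosine distance operator; smaller values mean more similar.
# The %s placeholders assume a driver that uses that parameter style.
QUERY_SQL = """
SELECT id, body, embedding <=> %s::vector AS cosine_distance
FROM documents
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""


def to_vector_literal(values):
    """Format numbers as pgvector's text representation, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


print(to_vector_literal([0.1, 0.2, 0.3]))  # [0.1,0.2,0.3]
```

Because vectors live in an ordinary table, the same query can join against relational data or run inside a transaction like any other SQL statement.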
Performance Benchmarks
Based on our internal testing with 10M vectors (768-dimensional, cosine similarity):
- Query latency (p95): Pinecone ~15ms, Weaviate ~25ms, pgvector (HNSW) ~40ms, Chroma ~120ms
- Recall @ top-10: Pinecone 0.97, Weaviate 0.95, pgvector 0.94, Chroma 1.0 (exact search)
- Index build time (10M vectors): Pinecone < 1 hour (managed), Weaviate ~2 hours, pgvector ~4 hours, Chroma ~8 hours
- Throughput (queries/sec/node): Weaviate ~8,000, pgvector ~3,000, Chroma ~500; Pinecone has no comparable per-node figure because the managed service scales capacity automatically
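For readers reproducing figures like these, a p95 latency can be derived from raw per-query timings. The sketch below uses the nearest-rank method, which is one of several percentile definitions and not necessarily the one used for the numbers above; the sample latencies are hypothetical.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical per-query latencies in milliseconds, including one slow outlier.
latencies_ms = [12, 14, 15, 13, 11, 16, 14, 13, 95, 12]
print(percentile(latencies_ms, 50))  # 13
print(percentile(latencies_ms, 95))  # 95
```

The gap between the median and the p95 here shows why tail latency, not the average, is the figure worth comparing across databases.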
Decision Framework
Choose Chroma if: You are prototyping, have under 1M vectors, and need something running in under an hour.
Choose Pinecone if: You want zero operational burden, have variable or unpredictable scale, and cost is secondary to velocity.
Choose Weaviate if: You need hybrid search, want open source with a managed option, or have complex metadata filtering requirements.
Choose pgvector if: You already use PostgreSQL, need transactional consistency between vectors and relational data, or operate in a regulated environment.
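The four rules above can be written down as a small checklist function. This is a simplified sketch: the flag names are hypothetical, and it returns every database whose criteria match rather than picking a single winner, since the rules themselves do not define a precedence.

```python
def matching_databases(*, prototyping=False, vector_count=None, needs_quick_setup=False,
                       wants_zero_ops=False, unpredictable_scale=False, cost_secondary=False,
                       needs_hybrid_search=False, wants_open_source_managed=False,
                       complex_metadata_filters=False, uses_postgres=False,
                       needs_transactions=False, regulated=False):
    """Return the databases whose 'Choose X if' criteria are satisfied."""
    matches = []
    # Chroma: all three conditions must hold.
    small = vector_count is not None and vector_count < 1_000_000
    if prototyping and small and needs_quick_setup:
        matches.append("Chroma")
    # Pinecone: all three conditions must hold.
    if wants_zero_ops and unpredictable_scale and cost_secondary:
        matches.append("Pinecone")
    # Weaviate: any one condition is enough.
    if needs_hybrid_search or wants_open_source_managed or complex_metadata_filters:
        matches.append("Weaviate")
    # pgvector: any one condition is enough.
    if uses_postgres or needs_transactions or regulated:
        matches.append("pgvector")
    return matches


print(matching_databases(uses_postgres=True, needs_hybrid_search=True))
# ['Weaviate', 'pgvector']
```

When more than one database matches, the remaining choice comes down to the performance, cost, and operational trade-offs discussed earlier.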
Our Recommendation
Start with pgvector if you already use PostgreSQL. It adds vector capability without new infrastructure and handles most production workloads. Move to Weaviate when you need hybrid search or its GraphQL query interface. Use Pinecone only when operational simplicity outweighs cost concerns.