I've built this kind of search in a production system, and this series is my attempt to write down what stayed useful after a lot of reading, testing, and tuning. I'm keeping the examples much smaller than the real system on purpose. I want to show the shape of the solution, not dump old code.
After going through all that, my conclusion was pretty simple: for this setup, I already had most of what I needed. PostgreSQL was enough. That doesn't mean Postgres is always the answer. It just means I didn't need a new system for this case.
The search had two parts. Some queries were about meaning. Others were about exact words, IDs, error codes, or typos. Once I looked at it that way, keeping text, metadata, and embeddings in Postgres felt like the simplest option. pgvector handled meaning. pg_trgm handled fuzzy text matching.
The local stack I used
I started with Docker because I wanted the setup to be clear. If I can't explain the stack with two or three files, I usually don't understand it well enough yet.
services:
  db:
    image: pgvector/pgvector:pg16
    ports:
      - "5432:5432"
    environment:
      POSTGRES_DB: hybrid_search
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
    volumes:
      - ./init.sql:/docker-entrypoint-initdb.d/01-init.sql:ro
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:
And then init.sql was just the extension setup plus the smallest table I need to explain the search.
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE TABLE knowledge_base (
    id bigserial PRIMARY KEY,
    content text NOT NULL,
    embedding vector(1536) NOT NULL
);
I wanted it simple. One container, one volume, one init script. I used vector(1536) because that matched the embedding model I had in mind. If you're on a different model, that line changes and the rest stays the same. The real table had more fields, but I don't need them here to explain the search itself.
The table shape that held up
I used knowledge_base in the examples, but the structure fits support data and documentation too. The main decision was that each row should be one searchable unit. In practice that usually meant a chunk, a section, or a single note instead of a whole document.
I kept content because I still wanted the original text for trigram search and debugging. I kept embedding in the same row because I wanted one write path and one place to inspect when something looked wrong.
In production I had more columns and more filters, but those details aren't the point here.
You can normalize this more, and I often do later. For the first pass I preferred one table because it kept reads and writes easy to inspect.
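To make "reads easy to inspect" a bit more concrete, here is a minimal sketch of what a vector read against that one table can look like. It isn't the production query; it just assumes a query embedding (produced by the same model used at ingestion) is passed in as a parameter, and uses pgvector's cosine distance operator.

-- Sketch: nearest-neighbor read against the single table.
-- $1 stands in for a query embedding from the same model used at write time.
-- <=> is pgvector's cosine distance operator (smaller = closer).
SELECT id,
       content,
       embedding <=> $1 AS distance
FROM knowledge_base
ORDER BY embedding <=> $1
LIMIT 5;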
The ingestion path stayed smaller than I expected
I overthought ingestion at first. Then I realized the basic job was simple: fetch the text, ask the embedding model for a vector, and write both together.
I'm not sharing the full ingestion code here. The useful part is just the shape:
# load_chunks, embed, and db are stand-ins for the real pipeline pieces;
# the point is the shape: fetch text, embed it, write both in one insert.
for chunk in load_chunks():
    vector = embed(chunk["content"])
    db.execute(
        "INSERT INTO knowledge_base (content, embedding) VALUES (%s, %s)",
        (chunk["content"], vector),
    )
In the real system there was more around this: chunking, retries, batching, updates, and failure handling. But the basic idea didn't change.
Why this felt better in practice
The main benefit was simple. I had one place to back up, one place to inspect data, and one place to debug when someone asked why a search result looked wrong.
That sounds minor until you actually have to answer questions like "why didn't this search find AUTH-417?" or "why did this document disappear after a reindex?" Having the raw text and the embedding side by side removed a surprising amount of guesswork.
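For the AUTH-417 kind of question, the trigram side is what answers it. This is only a sketch of that lookup, not the query from the real system: it uses pg_trgm's word similarity so a short code can match inside longer chunks, typos included.

-- Sketch: fuzzy lookup for a short code inside longer chunks.
-- <% matches when 'AUTH-417' is similar enough to some word-bounded
-- part of content, so near-misses like "AUTH417" still come back.
SELECT id,
       content,
       word_similarity('AUTH-417', content) AS score
FROM knowledge_base
WHERE 'AUTH-417' <% content
ORDER BY score DESC
LIMIT 5;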
I also liked that this didn't force an early architecture decision. If the workload stayed moderate, Postgres was enough. If it grew later, I could add ANN indexes, tune memory, or move pieces out with actual evidence.
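To show what "add ANN indexes later" usually means here, this is the kind of statement involved, assuming a recent pgvector and cosine-distance queries like the read sketch above:

-- Approximate nearest-neighbor index for the vector column; the operator
-- class should match the distance operator used in queries (<=> here).
CREATE INDEX ON knowledge_base USING hnsw (embedding vector_cosine_ops);

-- Trigram index so % and <% lookups don't have to scan every row.
CREATE INDEX ON knowledge_base USING gin (content gin_trgm_ops);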
The limits were still real
This doesn't solve every search problem. If the data is huge, if embeddings change all the time, or if you need very low latency at large scale, a specialized system can still make sense. The same is true if your team already runs Elasticsearch or OpenSearch well.
And extension support still matters. Most decent managed Postgres offerings support pgvector and pg_trgm now, but I still check before committing to the design.
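The check itself is quick. On a candidate instance, something like this shows whether both extensions are available and which versions are on offer (a generic query, not specific to any one provider):

-- Are vector and pg_trgm installable here, and at what versions?
SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name IN ('vector', 'pg_trgm');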
The other thing I learned is that chunking quality matters more than the database choice. Bad chunks give you bad recall anywhere. Postgres doesn't save you from that.
What this setup gave me was a clean base state: text and vectors in the same place, with a write path I could explain simply. For this kind of system, that was a good place to start.