RAG & retrieval

How do embeddings make RAG retrieval work?

Embeddings turn text into vectors so passages with similar meaning land near each other. Documents and queries must use the same model, and retrieval becomes a nearest-neighbor search for the top-K closest chunks.

Last updated 2026-06-15 · Physea Labs

Retrieval rests on embeddings: text is turned into vectors so that passages with similar meaning land near each other in vector space. The documents and the query must be embedded with the same model, because vectors produced by different models live in different spaces and the distances between them are not comparable. How well an embedding model captures meaning is measured by benchmarks such as MTEB, the Massive Text Embedding Benchmark.[1] Retrieval then becomes a nearest-neighbor search: find the top-K chunks whose vectors are closest to the query vector. The Language Model Architecture subject goes deeper on how this representation works.
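The nearest-neighbor step can be sketched in a few lines of Python. This is a minimal illustration, not a production retriever: `embed` is a hypothetical stand-in that counts words from a toy vocabulary, so it matches on shared words rather than on meaning the way a trained embedding model would. Cosine similarity is assumed as the closeness measure, and the names `VOCAB`, `embed`, `cosine_similarity`, and `top_k` are made up for this example. The search is a brute-force scan over every chunk.

```python
import math
from collections import Counter

# Assumption: a toy vocabulary standing in for a real embedding model.
VOCAB = ["cat", "dog", "pet", "stock", "market", "price"]


def embed(text: str) -> list[float]:
    """Hypothetical embedding: word counts over VOCAB (not a trained model)."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in VOCAB]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, chunks: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    # The query goes through the same embed() as the chunks, so both
    # sets of vectors live in the same space and can be compared.
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


chunks = [
    "the cat is a popular pet",
    "stock market price moves",
    "a dog is a loyal pet",
]
results = top_k("which pet is a cat", chunks, k=2)
for score, chunk in results:
    print(f"{score:.2f}  {chunk}")
```

With these three chunks, the two pet-related passages rank above the finance passage, which shares no vocabulary words with the query. In a real system, chunk vectors would typically be computed once and stored, and an approximate nearest-neighbor index would usually replace the full scan once the collection grows large.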

References

  1. Massive Text Embedding Benchmark (MTEB) — Hugging Face