How do you pick an embedding model?

Last updated 2026-06-15 · Physea Labs

Embeddings differ in quality by task and by language, so they are benchmarked. The Massive Text Embedding Benchmark (MTEB) spans many task types and languages, and its headline finding is a useful warning against shopping for a single winner: “no particular text embedding method dominates across all tasks.”[1] The right embedding model depends on what you are doing with it.
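One way to act on that finding is to compare candidates on the task you care about rather than on an overall average. The sketch below is illustrative only: the model names and scores are made up for the example and are not MTEB results, and `best_model_for` and `best_by_average` are hypothetical helpers, not part of any benchmark tooling.

```python
# Hypothetical per-task scores, invented for illustration.
# These are NOT real MTEB numbers and the model names are placeholders.
scores = {
    "model_a": {"retrieval": 0.52, "clustering": 0.47, "classification": 0.71},
    "model_b": {"retrieval": 0.49, "clustering": 0.51, "classification": 0.74},
    "model_c": {"retrieval": 0.55, "clustering": 0.44, "classification": 0.69},
}


def best_model_for(task, scores):
    """Return the model with the highest score on a single task."""
    return max(scores, key=lambda model: scores[model][task])


def best_by_average(scores):
    """Return the model with the highest mean score across all tasks."""
    return max(
        scores,
        key=lambda model: sum(scores[model].values()) / len(scores[model]),
    )


tasks = ["retrieval", "clustering", "classification"]
winners = {task: best_model_for(task, scores) for task in tasks}

print("Best on average:", best_by_average(scores))
for task, model in winners.items():
    print(f"Best for {task}: {model}")
```

With these made-up numbers, the model that wins on average is not the one that wins on retrieval, which is the pattern the benchmark warns about. In practice you would substitute scores measured on your own task and language.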

Embedding models & benchmarks

  • OpenAI Embeddings

    Hosted embedding models with a clear guide. The guide recommends cosine similarity and notes that the vectors are normalized to length 1, so a plain dot product gives the same result.

  • Cohere Embed

    Hosted embeddings with a documented semantic-search workflow.

  • MTEB

    Open benchmark for comparing embedding models across tasks and languages.
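
The cosine similarity note in the OpenAI entry above can be checked with a small sketch. It uses toy 2-dimensional vectors rather than real embeddings, and the helper names (`normalize`, `dot`, `cosine_similarity`) are assumptions made for this example, not calls from any provider's SDK.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to length 1 (unit norm)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b, for vectors of any length."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy vectors standing in for embeddings.
a = [3.0, 4.0]
b = [4.0, 3.0]

# After normalizing to length 1, the dot product equals the cosine similarity.
ua, ub = normalize(a), normalize(b)
print(cosine_similarity(a, b))
print(dot(ua, ub))
```

Both prints give the same value up to floating-point rounding. When the vectors are already length 1, the normalization step can be skipped and the dot product used directly.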

References

  1. MTEB: Massive Text Embedding Benchmark — Muennighoff et al., EACL 2023