
Embeddings & vectors

What is an embedding, and how does it turn meaning into geometry?

An embedding is a list of numbers, a vector, that represents text as a point in a high-dimensional space where the geometry encodes meaning, so similar items land close together. The numbers are learned, not assigned, so each dimension is a latent feature rather than a human-named label.

Last updated 2026-06-15 · Physea Labs

An embedding is a list of numbers, a vector, that represents a piece of text as a point in a high-dimensional space, arranged so that the geometry encodes meaning. OpenAI’s definition is about as direct as it gets: “an embedding is a vector (list) of floating point numbers. The distance between two vectors measures their relatedness.”[1] Items with similar meaning land close together; unrelated items land far apart.
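To make "distance measures relatedness" concrete, here is a minimal sketch that compares three tiny vectors. It uses cosine similarity, which is one common relatedness measure and not necessarily the one any particular provider uses. The 3-dimensional vectors are invented for illustration; real embeddings are produced by a trained model and typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors (assumed values, not output from a real model).
cat = [0.9, 0.8, 0.1]
kitten = [0.85, 0.9, 0.15]
truck = [0.1, 0.2, 0.95]

sim_cat_kitten = cosine_similarity(cat, kitten)
sim_cat_truck = cosine_similarity(cat, truck)

print(f"cat vs kitten: {sim_cat_kitten:.3f}")
print(f"cat vs truck:  {sim_cat_truck:.3f}")
```

With these made-up values, the cat and kitten vectors point in nearly the same direction and score close to 1, while cat and truck score much lower.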

The idea grew out of a simple observation: words used in similar contexts tend to mean similar things, so they should get similar vectors. word2vec showed in 2013 that you could learn such vectors cheaply at scale, and that the resulting space captured both syntactic and semantic regularities between words.[2] The numbers are learned rather than assigned, so each dimension is a latent feature, not a label a human picked.

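[Figure: a 2-D sketch of a high-dimensional space. Points cat, kitten, and feline form a cluster labeled animals; car, truck, and vehicle form a cluster labeled vehicles. A query point, “my pet”, is annotated: nearest neighbors = closest in meaning.]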
An embedding turns text into a point in space, so similar meanings sit close together and a query finds its nearest neighbors.
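The nearest-neighbor lookup in the figure can be sketched as a brute-force ranking. Everything here is an assumption made for illustration: the 2-D coordinates are invented to mimic the sketch, the query vector stands in for whatever a real model would produce for “my pet”, and `nearest_neighbors` is a hypothetical helper rather than a library function.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, items, k=3):
    """Rank every stored vector by similarity to the query; return the top k names."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in items.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Hypothetical 2-D coordinates (assumed values, not output from a real model).
toy_vectors = {
    "cat": [0.9, 0.1],
    "kitten": [0.85, 0.2],
    "feline": [0.8, 0.15],
    "car": [0.1, 0.9],
    "truck": [0.15, 0.85],
    "vehicle": [0.2, 0.8],
}

# Assumed embedding for the query "my pet".
query_my_pet = [0.88, 0.18]

neighbors = nearest_neighbors(query_my_pet, toy_vectors, k=3)
print(neighbors)
```

With these toy coordinates, the three animal words rank ahead of the vehicle words. Scanning every stored vector is fine for a handful of items; larger collections usually rely on an approximate nearest-neighbor index instead.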

References

  1. Embeddings guide — OpenAI
  2. Efficient Estimation of Word Representations in Vector Space (word2vec) — Mikolov et al., arXiv