PhyseaWiki How AI actually works Papers physea.ai →

The library

Papers, in plain language

The primary sources this wiki is built on. Each entry is the short version: what the paper introduced and why it matters, with a link to read the original. Every link here is a source the wiki actually cites.

  1. 2013
    Efficient Estimation of Word Representations in Vector Space (word2vec)

    Mikolov, Chen, Corrado, Dean · arXiv · Retrieval & embeddings

    Showed you could learn useful word vectors cheaply at scale, and that the resulting space captures meaning: similar words sit close together, and relationships show up as consistent directions. The seed of modern embeddings.

  2. 2014
    Neural Machine Translation by Jointly Learning to Align and Translate

    Bahdanau, Cho, Bengio · arXiv (ICLR 2015) · Architecture

    Introduced attention. Instead of squeezing a whole sentence into one fixed vector, the model learns to "soft-search" the source for the parts relevant to each output word. This idea became the core of the transformer three years later.

  3. 2017
    Attention Is All You Need

    Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez, Kaiser, Polosukhin · arXiv (NeurIPS 2017) · Architecture

    The transformer. It dropped recurrence entirely and built a model on attention alone, so every token can look at every other in parallel. It trained faster and scaled better than anything before it, and it underlies essentially every modern LLM.

  4. 2019
    Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

    Reimers, Gurevych · EMNLP-IJCNLP · Retrieval & embeddings

    Made it practical to embed whole sentences so they can be compared by cosine similarity. Finding the most similar pair in 10,000 sentences dropped from 65 hours to about 5 seconds, which is what made large-scale semantic search feasible.

  5. 2020
    Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

    Lewis et al. · NeurIPS · Retrieval & embeddings

    Coined RAG. It paired a generative model with a searchable index of documents, retrieving relevant passages and conditioning the answer on them. The result was more factual and could draw on knowledge outside the model’s training data.

  6. 2022
    MTEB: Massive Text Embedding Benchmark

    Muennighoff, Tazi, Magne, Reimers · arXiv (EACL 2023) · Evaluation

    A broad benchmark for embedding models across many tasks and languages. Its headline finding is a useful caution: no single embedding model wins everywhere, so the right choice depends on what you are doing.

  7. 2023
    Not what you’ve signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection

    Greshake, Abdelnabi, Mishra, Endres, Holz, Fritz · arXiv (AISec) · Safety

    Formalized indirect prompt injection: hiding instructions in data a model later reads (a web page, a document) so an attacker who never talks to the model can still control it. Demonstrated real attacks against a production system.

  8. 2024
    Building Effective Agents

    Anthropic · Anthropic (essay) · Agents

    The reference guide on the difference between workflows (predefined paths) and agents (the model directs itself), with the practical advice to use the simplest thing that works and only add autonomy when the task needs it.

  9. 2025
    Measuring AI Ability to Complete Long Tasks

    METR · METR (study) · Evaluation

    Measured how long a task an AI can complete reliably, finding models far more dependable on short tasks than long ones, with that time horizon improving over time. A grounded way to think about why agents still need guardrails.

  10. 2025
    Defeating Prompt Injections by Design (CaMeL)

    Debenedetti et al., Google DeepMind · arXiv · Safety

    A by-design defense against prompt injection: split a privileged planner model from a quarantined model that reads untrusted data, and track data provenance so tainted data cannot trigger dangerous actions. Strong results, but not a complete fix.