
Local inference basics

Which one should I install first?

Pick by comfort level. Want a chat window with no setup? Start with a desktop app like LM Studio or Jan. Comfortable in a terminal and want control? Start with Ollama or llama.cpp.

Last updated 2026-06-15 · Physea Labs

You do not need to understand the whole stack to begin. Pick one tool by how you like to work, install it, and let it bundle whatever runtime it needs.

If you want the easiest path, start with a desktop app. LM Studio and Jan both give you a chat window and a built-in place to search for and download models, with nothing to set up by hand.[3, 4] Open the app, browse to a small model that fits your memory, download it, and start chatting. This is the right first step for most people.

If you are comfortable at a command line and want more control, start with a runtime directly. There are two common choices. Ollama is the gentler one: install it, then run a model with one short command.[1] llama.cpp gives you the most options and the widest hardware support, at the cost of more setup.[2]
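As a rough illustration of the Ollama path, the sketch below builds the `ollama run` command and checks whether the `ollama` program is on your PATH. It does not start anything itself. The model tag `llama3.2` is only a placeholder assumption, and the helper names are hypothetical; substitute whichever small model you chose.

```python
import shutil


def ollama_run_command(model: str) -> list[str]:
    """Return the terminal command (as a list of words) that starts a chat with `model`."""
    return ["ollama", "run", model]


def is_installed(tool: str) -> bool:
    """Return True if `tool` can be found on the PATH."""
    return shutil.which(tool) is not None


# "llama3.2" is a placeholder tag, not a recommendation.
cmd = ollama_run_command("llama3.2")
print("Command to type in a terminal:", " ".join(cmd))
print("ollama found on PATH:", is_installed("ollama"))
```

If the second line prints `False`, install Ollama first and open a new terminal so the PATH change takes effect.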

One sizing note before you download anything: a model has to fit in your machine's memory (system RAM, or GPU memory if it runs on a graphics card) to run well, so check the model size calculator and pick a smaller model for your first try. You can always step up later.
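To make the sizing idea concrete, here is a back-of-the-envelope sketch, not the calculator itself. It assumes the weights take roughly (parameter count × bits per weight ÷ 8) bytes, and it multiplies by a 1.2 overhead factor for context and runtime buffers. That factor is an illustrative assumption, and the function names are hypothetical.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def estimate_weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / GIB


def fits_in_memory(params_billion: float, bits_per_weight: float,
                   memory_gib: float, overhead: float = 1.2) -> bool:
    """Rough check: weights times an assumed overhead factor must fit in memory."""
    needed = estimate_weights_gib(params_billion, bits_per_weight) * overhead
    return needed <= memory_gib


# Example: a 7-billion-parameter model stored at 4 bits per weight.
print(round(estimate_weights_gib(7, 4), 2), "GiB of weights")
print("Fits in 8 GiB:", fits_in_memory(7, 4, 8))
print("70B at 4 bits fits in 16 GiB:", fits_in_memory(70, 4, 16))
```

Real memory use also depends on context length and the runtime, so treat the result as a first filter and leave headroom.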

Where to get them

  • Ollama

    Runtime and model manager, driven from the terminal

  • llama.cpp

    The low-level inference engine; most control, most setup

  • LM Studio

    Desktop app with chat and a model browser; free for home and work

  • Jan

    Open source desktop app that runs fully offline

References

  1. Ollama — Ollama
  2. llama.cpp — ggml-org (GitHub)
  3. LM Studio — LM Studio
  4. Jan — Jan / Menlo Research