PhyseaWiki How AI actually works Papers physea.ai →

Serving & runtimes

What are the easiest ways to run a model on my own machine?

Ollama runs models from the command line, pulling them by name like Docker images. LM Studio is a point-and-click desktop app with a built-in model browser and chat. Both give you a local OpenAI-compatible server.

Last updated 2026-06-15 · Physea Labs

If you want to get a model running today, these two tools ask the least of you.

Ollama is the command-line option. You pull a model by name and run it, much the way you would pull a Docker image. The project’s own tagline lists models you can start with, such as Qwen, Gemma, DeepSeek, and Llama.[1] It is open source under the MIT license, gives you a local REST API, and offers OpenAI-compatible chat completions so other apps can talk to it.[2] Note one current limit: Ollama’s OpenAI-compatible layer covers chat completions, and an embeddings API was listed as still under consideration rather than available.[2]

LM Studio is the point-and-click option. It is a desktop application for running models locally, with a built-in browser for finding and downloading them and a chat window for talking to them.[3] When you are ready to wire it into your own code, it can serve OpenAI-like endpoints on your machine and across your network.[3] It also offers a headless mode for servers or machines where you do not want a GUI.[3]

A useful thing to know: LM Studio runs llama.cpp under the hood for GGUF models, and adds Apple’s MLX engine on Apple Silicon.[3] So the friendly window you click in is sitting on top of the same engine covered on the next page.

Friendly local runtimes

  • Ollama

    Command-line tool that pulls and runs models by name; local REST and OpenAI-compatible API.

  • LM Studio

    Desktop app with a model browser, chat window, and a local OpenAI-compatible server.

References

  1. Ollama README — Ollama
  2. OpenAI compatibility — Ollama
  3. LM Studio Documentation — LM Studio