Use a Local Model

Use a local model when you want inference to run on your own machine or on infrastructure you control.

| Local path | Zed AI features | External Agents | Terminal Threads | Notes |
| --- | --- | --- | --- | --- |
| Ollama | Yes | Separate config | Separate config | Configure Ollama for Zed AI features |
| LM Studio | Yes | Separate config | Separate config | Configure LM Studio for Zed AI features |
| Local OpenAI-compatible server | Yes | Separate config | Separate config | Configure base URL, model, and key if needed |
| Local/self-hosted edit prediction | Edit Prediction only | No | No | Uses Edit Prediction setup |

Ollama

Use Ollama for local models with Zed Agent, Inline Assistant, and similar model-backed Zed AI features.

  1. Download and install Ollama from ollama.com/download.

  2. Pull a model:

    ollama pull mistral
    
  3. Make sure the Ollama server is running. On macOS, open Ollama.app. On Linux or from a shell, run:

    ollama serve
    
  4. In Zed, select an Ollama model from the model dropdown.

Zed automatically discovers models that Ollama has pulled. To disable autodiscovery and list models yourself, set auto_discover to false and declare each model under available_models:

{
  "language_models": {
    "ollama": {
      "api_url": "http://localhost:11434",
      "auto_discover": false,
      "available_models": [
        {
          "name": "qwen2.5-coder",
          "display_name": "qwen 2.5 coder",
          "max_tokens": 32768,
          "supports_tools": true,
          "supports_thinking": true,
          "supports_images": true
        }
      ]
    }
  }
}

Ollama Context Length

Zed sends the context length to Ollama as the num_ctx parameter on each request. By default, Zed uses 4096 tokens.

Set a context length for all Ollama models:

{
  "language_models": {
    "ollama": {
      "context_window": 8192
    }
  }
}

You can also configure context length per model with max_tokens in available_models.
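
As a minimal sketch of the per-model form, the fragment below sets max_tokens on a single available_models entry. The model name is reused from the autodiscovery example earlier on this page, and the value 16384 is purely illustrative, not a recommendation; choose a value your model and hardware can actually support.

```json
{
  "language_models": {
    "ollama": {
      "available_models": [
        {
          // Illustrative entry; name reused from the earlier example.
          "name": "qwen2.5-coder",
          // Assumed value for illustration: context length for this model only.
          "max_tokens": 16384
        }
      ]
    }
  }
}
```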

If your Ollama server requires a key, enter the key in the provider UI or set OLLAMA_API_KEY. For remote Ollama services such as Ollama Turbo, set the API URL to the remote endpoint and provide an API key.
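
For the remote case, a sketch of the settings might look like the following. The URL is a hypothetical placeholder, not a real endpoint; substitute the address your provider gives you. The key is not stored in this fragment: supply it through the provider UI or OLLAMA_API_KEY.

```json
{
  "language_models": {
    "ollama": {
      // Hypothetical placeholder URL; replace with your remote endpoint.
      "api_url": "https://ollama.example.com"
    }
  }
}
```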

LM Studio

Use LM Studio for local models with Zed Agent, Inline Assistant, and similar model-backed Zed AI features.

  1. Download and install LM Studio.

  2. Download at least one model in LM Studio, or use the LM Studio CLI:

    lms get qwen2.5-coder-7b
    
  3. Start the LM Studio API server:

    lms server start
    
  4. In Zed, select an LM Studio model from the model dropdown.

If your LM Studio server requires a key, enter the key in the provider UI or set LMSTUDIO_API_KEY.

Local OpenAI-Compatible Servers

For a local or self-hosted server that exposes an OpenAI-compatible API, configure it as an OpenAI-compatible provider: set the base URL, the model, and an API key if the server requires one.
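
A hedged sketch of what such a provider entry might look like is below. It assumes an openai_compatible settings key and the field names shown, none of which appear on this page, so check them against the OpenAI-compatible provider documentation before relying on them. The provider name, URL, port, and model name are all hypothetical placeholders.

```json
{
  "language_models": {
    // Assumed settings key; verify against the provider documentation.
    "openai_compatible": {
      // Hypothetical provider name.
      "Local Server": {
        // Hypothetical base URL of a server running on this machine.
        "api_url": "http://localhost:8080/v1",
        "available_models": [
          {
            // Hypothetical model identifier as reported by the server.
            "name": "my-local-model",
            "display_name": "My Local Model",
            "max_tokens": 32768
          }
        ]
      }
    }
  }
}
```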

Local Edit Prediction

Edit Prediction has its own provider setup. See Edit Prediction for local and self-hosted edit prediction options.

Agent Path Boundaries

This page covers local models configured in Zed. External Agents and terminal CLIs may have their own local-model setup; configure those in the agent or CLI.