Inline Assistant

Usage Overview

Use ctrl-enter|ctrl-enter to open the Inline Assistant in editors, channel notes, and the terminal panel.

The Inline Assistant sends your current selection (or line) to a language model and replaces it with the response.

Getting Started

If you're using the Inline Assistant for the first time, you need to have at least one LLM provider configured. You can do that by:

using Zed-hosted models, so you have access to models billed through Zed
using API access, either from model providers like Anthropic or model gateways like OpenRouter.

If you have already set up an LLM provider to interact with the Agent Panel, then that will also work for the Inline Assistant.

External Agents run in agent threads, but they are not available for Inline Assistant generations. The Inline Assistant uses Zed-configured LLM providers, including Zed-hosted models, provider API keys, gateways, local models, and supported subscriptions.

Adding Context

You can add context in the Inline Assistant the same way you can in the Agent Panel:

@-mention files, directories, past threads, instruction files, and symbols
paste images that are copied on your clipboard

You can also create a thread in the Agent Panel, then reference it with @thread in the Inline Assistant. This lets you refine a specific change from a larger thread without re-explaining context.

Parallel Generations

The Inline Assistant can generate multiple changes at once:

Multiple Cursors

With multiple cursors, pressing ctrl-enter|ctrl-enter sends the same prompt to each cursor position, generating changes at all locations simultaneously.

This works well with excerpts in multibuffers.

Multiple Models

You can use the Inline Assistant to send the same prompt to multiple models at once.

Here's how you can customize your settings file (how to edit) to add this functionality:

{
  "agent": {
    "default_model": {
      "provider": "zed.dev",
      "model": "claude-sonnet-4-5"
    },
    "inline_alternatives": [
      {
        "provider": "zed.dev",
        "model": "gpt-5-mini"
      }
    ]
  }
}

When multiple models are configured, you'll see in the Inline Assistant UI buttons that allow you to cycle between outputs generated by each model.

The models you specify here are always used in addition to the Inline Assistant's primary model. That's the model set in agent.inline_assistant_model, or your agent.default_model if no Inline Assistant model is configured.

For example, the following configuration will generate three outputs for every assist. One with Claude Sonnet 4.5 (the default model), another with GPT-5-mini, and another one with Gemini 3 Flash.

{
  "agent": {
    "default_model": {
      "provider": "zed.dev",
      "model": "claude-sonnet-4-5"
    },
    "inline_alternatives": [
      {
        "provider": "zed.dev",
        "model": "gpt-5-mini"
      },
      {
        "provider": "zed.dev",
        "model": "gemini-3-flash"
      }
    ]
  }
}

Inline Assistant vs. Edit Prediction

Both features generate inline code, but they work differently:

Inline Assistant: You write a prompt and select what to transform. You control the context.
Edit Prediction: Zed automatically suggests edits based on your recent changes, visited files, and cursor position. No prompting required.

The key difference: Inline Assistant is explicit and prompt-driven; Edit Prediction is automatic and context-inferred.

Prefilling Prompts

To create a custom keybinding that prefills a prompt, you can add the following format in your keymap:

[
  {
    "context": "Editor && mode == full",
    "bindings": {
      "ctrl-shift-enter": [
        "assistant::InlineAssist",
        { "prompt": "Build a snake game" }
      ]
    }
  }
]