Zed-Hosted Models

Zed's plans offer hosted versions of major LLMs with higher rate limits than direct API access. Model availability is updated regularly. To use your own API keys instead, see LLM Providers. For general setup, see AI Quick Start.

Note: Claude Opus models, GPT-5.5 pro, and GPT-5.4 pro are available only on Zed Pro and Zed Business.

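| Model | Provider | Token Type | Provider Price per 1M tokens | Zed Price per 1M tokens |
|---|---|---|---|---|
| Claude Opus 4.5 | Anthropic | Input | $5.00 | $5.50 |
| Claude Opus 4.5 | Anthropic | Output | $25.00 | $27.50 |
| Claude Opus 4.5 | Anthropic | Input - Cache Write | $6.25 | $6.875 |
| Claude Opus 4.5 | Anthropic | Input - Cache Read | $0.50 | $0.55 |
| Claude Opus 4.6 | Anthropic | Input | $5.00 | $5.50 |
| Claude Opus 4.6 | Anthropic | Output | $25.00 | $27.50 |
| Claude Opus 4.6 | Anthropic | Input - Cache Write | $6.25 | $6.875 |
| Claude Opus 4.6 | Anthropic | Input - Cache Read | $0.50 | $0.55 |
| Claude Opus 4.7 | Anthropic | Input | $5.00 | $5.50 |
| Claude Opus 4.7 | Anthropic | Output | $25.00 | $27.50 |
| Claude Opus 4.7 | Anthropic | Input - Cache Write | $6.25 | $6.875 |
| Claude Opus 4.7 | Anthropic | Input - Cache Read | $0.50 | $0.55 |
| Claude Opus 4.8 | Anthropic | Input | $5.00 | $5.50 |
| Claude Opus 4.8 | Anthropic | Output | $25.00 | $27.50 |
| Claude Opus 4.8 | Anthropic | Input - Cache Write | $6.25 | $6.875 |
| Claude Opus 4.8 | Anthropic | Input - Cache Read | $0.50 | $0.55 |
| Claude Sonnet 4.5 | Anthropic | Input | $3.00 | $3.30 |
| Claude Sonnet 4.5 | Anthropic | Output | $15.00 | $16.50 |
| Claude Sonnet 4.5 | Anthropic | Input - Cache Write | $3.75 | $4.125 |
| Claude Sonnet 4.5 | Anthropic | Input - Cache Read | $0.30 | $0.33 |
| Claude Sonnet 4.6 | Anthropic | Input | $3.00 | $3.30 |
| Claude Sonnet 4.6 | Anthropic | Output | $15.00 | $16.50 |
| Claude Sonnet 4.6 | Anthropic | Input - Cache Write | $3.75 | $4.125 |
| Claude Sonnet 4.6 | Anthropic | Input - Cache Read | $0.30 | $0.33 |
| Claude Haiku 4.5 | Anthropic | Input | $1.00 | $1.10 |
| Claude Haiku 4.5 | Anthropic | Output | $5.00 | $5.50 |
| Claude Haiku 4.5 | Anthropic | Input - Cache Write | $1.25 | $1.375 |
| Claude Haiku 4.5 | Anthropic | Input - Cache Read | $0.10 | $0.11 |
| GPT-5.5 pro | OpenAI | Input | $30.00 | $33.00 |
| GPT-5.5 pro | OpenAI | Output | $180.00 | $198.00 |
| GPT-5.5 | OpenAI | Input | $5.00 | $5.50 |
| GPT-5.5 | OpenAI | Output | $30.00 | $33.00 |
| GPT-5.5 | OpenAI | Cached Input | $0.50 | $0.55 |
| GPT-5.4 pro | OpenAI | Input | $30.00 | $33.00 |
| GPT-5.4 pro | OpenAI | Output | $180.00 | $198.00 |
| GPT-5.4 | OpenAI | Input | $2.50 | $2.75 |
| GPT-5.4 | OpenAI | Output | $15.00 | $16.50 |
| GPT-5.4 | OpenAI | Cached Input | $0.025 | $0.0275 |
| GPT-5.3-Codex | OpenAI | Input | $1.75 | $1.925 |
| GPT-5.3-Codex | OpenAI | Output | $14.00 | $15.40 |
| GPT-5.3-Codex | OpenAI | Cached Input | $0.175 | $0.1925 |
| GPT-5.2 | OpenAI | Input | $1.75 | $1.925 |
| GPT-5.2 | OpenAI | Output | $14.00 | $15.40 |
| GPT-5.2 | OpenAI | Cached Input | $0.175 | $0.1925 |
| GPT-5.2-Codex | OpenAI | Input | $1.75 | $1.925 |
| GPT-5.2-Codex | OpenAI | Output | $14.00 | $15.40 |
| GPT-5.2-Codex | OpenAI | Cached Input | $0.175 | $0.1925 |
| GPT-5 mini | OpenAI | Input | $0.25 | $0.275 |
| GPT-5 mini | OpenAI | Output | $2.00 | $2.20 |
| GPT-5 mini | OpenAI | Cached Input | $0.025 | $0.0275 |
| GPT-5 nano | OpenAI | Input | $0.05 | $0.055 |
| GPT-5 nano | OpenAI | Output | $0.40 | $0.44 |
| GPT-5 nano | OpenAI | Cached Input | $0.005 | $0.0055 |
| Gemini 3.1 Pro | Google | Input | $2.00 | $2.20 |
| Gemini 3.1 Pro | Google | Output | $12.00 | $13.20 |
| Gemini 3.5 Flash | Google | Input | $1.50 | $1.65 |
| Gemini 3.5 Flash | Google | Output | $9.00 | $9.90 |
| Gemini 3 Flash | Google | Input | $0.50 | $0.55 |
| Gemini 3 Flash | Google | Output | $3.00 | $3.30 |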

Recent Model Retirements

As of February 19, 2026, Zed Pro serves newer model versions in place of the retired models below:

  • Claude Opus 4.1 → Claude Opus 4.5, Claude Opus 4.6, Claude Opus 4.7, or Claude Opus 4.8
  • Claude Sonnet 4 → Claude Sonnet 4.5 or Claude Sonnet 4.6
  • Claude Sonnet 3.7 (retired Feb 19) → Claude Sonnet 4.5 or Claude Sonnet 4.6
  • GPT-5.1 and GPT-5 → GPT-5.2 or GPT-5.2-Codex
  • Gemini 2.5 Pro → Gemini 3.1 Pro
  • Gemini 3 Pro → Gemini 3.1 Pro
  • Gemini 2.5 Flash → Gemini 3 Flash or Gemini 3.5 Flash

Usage

Any usage of a Zed-hosted model will be billed at the Zed Price (rightmost column above). See Plans & Pricing for details on Zed's plans and limits for use of hosted models.
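
As a rough illustration of how the Zed Price column turns into a bill, the sketch below multiplies token counts by per-1M-token prices copied from the pricing table. It is not Zed's billing code: the function name `estimate_cost_usd`, the dictionary layout, the token-type keys, and the token counts in the usage example are all hypothetical, and it assumes each token type is billed independently at its listed rate.

```python
from decimal import Decimal

# Zed prices in USD per 1M tokens, copied from the pricing table.
# Only two models are listed here to keep the sketch short.
ZED_PRICE_PER_MTOK = {
    "Claude Sonnet 4.5": {
        "input": Decimal("3.30"),
        "output": Decimal("16.50"),
        "cache_write": Decimal("4.125"),
        "cache_read": Decimal("0.33"),
    },
    # Hosted Gemini models have no cached-input price.
    "Gemini 3 Flash": {
        "input": Decimal("0.55"),
        "output": Decimal("3.30"),
    },
}

TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(model: str, tokens: dict[str, int]) -> Decimal:
    """Estimate cost for one model (hypothetical helper, not a Zed API).

    `tokens` maps a token type (e.g. "input") to a token count.
    Assumes each token type is billed independently at its listed rate.
    """
    prices = ZED_PRICE_PER_MTOK[model]
    total = Decimal("0")
    for token_type, count in tokens.items():
        if token_type not in prices:
            raise KeyError(f"{model} has no listed price for {token_type!r}")
        total += prices[token_type] * Decimal(count) / TOKENS_PER_MTOK
    return total


if __name__ == "__main__":
    # Made-up token counts, for illustration only.
    usage = {"input": 120_000, "output": 8_000, "cache_read": 400_000}
    print(estimate_cost_usd("Claude Sonnet 4.5", usage))
```

`Decimal` is used instead of `float` so that small per-token prices such as $0.0055 add up without binary rounding error.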

Because Zed-hosted Gemini models do not use Google context caching, Gemini usage is billed only as input and output tokens; there is no separate cached-input price for these models. This preserves zero-data-retention behavior for hosted Gemini requests. For background, see Google's Vertex AI documentation on context caching and zero data retention.

LLMs can enter unproductive loops that require user intervention. Monitor longer-running tasks and interrupt if needed.

Context Windows

A context window is the maximum span of text and code an LLM can consider at once, including both the input prompt and output generated by the model.

| Model | Provider | Zed-Hosted Context Window |
|---|---|---|
| Claude Opus 4.5 | Anthropic | 200k |
| Claude Opus 4.6 | Anthropic | 1M |
| Claude Opus 4.7 | Anthropic | 1M |
| Claude Opus 4.8 | Anthropic | 1M |
| Claude Sonnet 4.5 | Anthropic | 200k |
| Claude Sonnet 4.6 | Anthropic | 1M |
| Claude Haiku 4.5 | Anthropic | 200k |
| GPT-5.5 pro | OpenAI | 272k input / 400k total |
| GPT-5.5 | OpenAI | 272k input / 400k total |
| GPT-5.4 pro | OpenAI | 272k input / 400k total |
| GPT-5.4 | OpenAI | 272k input / 400k total |
| GPT-5.3-Codex | OpenAI | 272k input / 400k total |
| GPT-5.2 | OpenAI | 272k input / 400k total |
| GPT-5.2-Codex | OpenAI | 272k input / 400k total |
| GPT-5 mini | OpenAI | 272k input / 400k total |
| GPT-5 nano | OpenAI | 272k input / 400k total |
| Gemini 3.1 Pro | Google | 200k |
| Gemini 3.5 Flash | Google | 1M |
| Gemini 3 Flash | Google | 1M |

Zed currently limits hosted Gemini 3.1 Pro requests to 200k tokens because pricing changes above that context size.

Each Agent thread in Zed maintains its own context window. The more prompts, attached files, and responses a thread accumulates, the more of that window it consumes.

Start a new thread for each distinct task to keep context focused.
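
To make the context window limits concrete, here is a minimal sketch of a pre-flight check on whether a thread still fits a model's window. It rests on several assumptions that the table does not spell out: "200k" and "1M" are read as 200,000 and 1,000,000 tokens, "272k input / 400k total" is read as a cap on input tokens plus a separate cap on input and output combined, and token counts are supplied by the caller rather than computed by a tokenizer. The names `ContextLimit`, `LIMITS`, and `fits_context` are hypothetical and not part of Zed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContextLimit:
    """Limits for one hosted model (hypothetical structure)."""
    total_tokens: int                       # input + output combined
    max_input_tokens: Optional[int] = None  # separate input cap, if any


# A few rows from the context window table, under the reading
# described in the lead-in (k = 1,000 tokens, M = 1,000,000 tokens).
LIMITS = {
    "Claude Sonnet 4.5": ContextLimit(total_tokens=200_000),
    "Claude Sonnet 4.6": ContextLimit(total_tokens=1_000_000),
    "GPT-5.2": ContextLimit(total_tokens=400_000, max_input_tokens=272_000),
    "Gemini 3.1 Pro": ContextLimit(total_tokens=200_000),
}


def fits_context(model: str, input_tokens: int, max_output_tokens: int) -> bool:
    """Return True if the request stays within the model's assumed limits."""
    limit = LIMITS[model]
    if limit.max_input_tokens is not None and input_tokens > limit.max_input_tokens:
        return False
    return input_tokens + max_output_tokens <= limit.total_tokens


if __name__ == "__main__":
    # A long-running thread that has accumulated 190k tokens of context.
    thread_tokens = 190_000
    for model in LIMITS:
        ok = fits_context(model, thread_tokens, max_output_tokens=20_000)
        print(f"{model}: {'fits' if ok else 'start a new thread or trim context'}")
```

A check like this is only an estimate, since actual token counts depend on each provider's tokenizer.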

Tool Calls

Models can use tools to interface with your code, search the web, and perform other useful functions.