We launched Zeta2, Zed's edit prediction model, in March, and promised more improvements were on the way. Here they are.
Zeta2.1 emits a third as many output tokens as Zeta2 (about 90 versus 270 on average), making predictions roughly 50ms faster at both p50 and p90 and requiring 30% fewer servers to handle the same traffic:
| Metric | Zeta2 | Zeta2.1 |
|---|---|---|
| Output tokens (avg) | ~270 | ~90 (−67%) |
| Response time (p50) | 189ms | 136ms (−28%) |
| Response time (p90) | 401ms | 350ms (−13%) |
| Acceptance rate | Baseline | +0.51% |
| Explicit rejection rate | Baseline | −4.10% |
These efficiency gains came from a new prompt format we've dubbed "Multi-Region". Zeta2 output a large region around your cursor with its edits applied; with the Multi-Region format, Zeta2.1 outputs only the region around the code it wants to change. This took several iterations to get right, but the result is even faster predictions on every keystroke.
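To make the difference concrete, here is a minimal Python sketch of the idea, not the actual Zeta2.1 prompt format (see the model card on Hugging Face for that). It compares re-emitting a whole window around the cursor with emitting only the lines that changed. The function names, the `radius` and `context` parameters, and the use of `difflib` are all assumptions made for illustration, and the sketch handles a single changed span rather than multiple regions.

```python
import difflib


def full_region_output(edited_lines, cursor_line, radius):
    """Illustrative 'large region' output: re-emit a window around the cursor."""
    start = max(0, cursor_line - radius)
    end = min(len(edited_lines), cursor_line + radius + 1)
    return edited_lines[start:end]


def changed_region_output(original_lines, edited_lines, context=1):
    """Illustrative 'changed region' output: emit only the edited span plus context."""
    matcher = difflib.SequenceMatcher(
        a=original_lines, b=edited_lines, autojunk=False
    )
    # Each opcode is (tag, i1, i2, j1, j2); keep everything that is not 'equal'.
    changed = [op for op in matcher.get_opcodes() if op[0] != "equal"]
    if not changed:
        return []
    # j1 of the first change and j2 of the last change bound the edited span.
    start = max(0, changed[0][3] - context)
    end = min(len(edited_lines), changed[-1][4] + context)
    return edited_lines[start:end]


# Hypothetical 20-line buffer with a single predicted edit on line 10.
original = [f"line {i}" for i in range(20)]
edited = list(original)
edited[10] = "line 10 changed"

full = full_region_output(edited, cursor_line=10, radius=8)
changed = changed_region_output(original, edited)
print(len(full), "lines emitted for the full window")
print(len(changed), "lines emitted for the changed region")
```

The fewer lines a model has to emit, the fewer output tokens it has to generate, which is the source of the savings in the table above.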
Zeta2.1 is open-weight, just like Zeta1 and Zeta2. You can see examples of the new prompt format and download the model on Hugging Face.
As with Zeta2, Zeta2.1 was trained entirely on opt-in data in open-source repositories. If you'd like to help contribute to future improvements, you can opt in by toggling the data collection setting.
Try It
Zeta2.1 is even better for running locally, and works out of the box. Additionally, with this release we've begun publishing bindings on PyPI for the Rust code we use in production to format prompts, making it even easier to self-host.
Zeta2.1 is the default edit prediction model in Zed today. You can try it out for free, or check out Zed Pro or Zed Business for unlimited edit predictions.