The way we build software at Zed is changing. Over the past several months, more of our team has started working with AI agents as part of our daily workflow. It's moving fast: new models, new capabilities, new patterns emerging week by week. Everyone is learning.
I've been diving deep into this shift, iterating on how to use agents effectively and sharing what works (and what doesn't). What follows is the framework for thinking about agentic development that has helped shape how we approach it at Zed. And it all comes down to the nature of the mind behind the agent: the LLM.
What are LLMs?
At its core, an LLM is a function with two inputs (a sequence of tokens and some randomness) and one output: a likely next token. To use an LLM effectively is to constrain the space of possible next tokens until only the correct answer remains. The labs did half the work during training; we do the other half with careful prompting and a powerful agent harness.
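To make that concrete, here's a minimal sketch of the mental model in Rust. It is not how a real inference engine works, and every name and number is illustrative: per-token scores stand in for the model's output over its vocabulary, and a single random draw stands in for the sampler's randomness.

```rust
/// The token sequence (already reduced to per-token scores here) and a random
/// draw in [0, 1) go in; one next-token index comes out.
fn sample_next_token(logits: &[f32], temperature: f32, random_draw: f32) -> usize {
    let t = temperature.max(1e-6);
    // Softmax over temperature-scaled scores: lower temperature concentrates
    // probability on the highest-scoring token.
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let weights: Vec<f32> = logits.iter().map(|l| ((l - max) / t).exp()).collect();
    let total: f32 = weights.iter().sum();

    // Walk the cumulative distribution until it passes the random draw.
    let mut cumulative = 0.0;
    for (index, weight) in weights.iter().enumerate() {
        cumulative += weight / total;
        if random_draw < cumulative {
            return index;
        }
    }
    weights.len() - 1
}

fn main() {
    // Made-up scores for a four-token vocabulary.
    let logits = [2.0_f32, 0.5, 0.1, -1.0];
    println!("next token index: {}", sample_next_token(&logits, 0.7, 0.42));
}
```

Sharpen the prompt or lower the temperature and the distribution narrows; that's what constraining the space of possible next tokens looks like mechanically.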
But defining "correct" has always been the hard part. It requires domain knowledge and judgment—knowing which tests actually matter, when an abstraction is worth the complexity, whether an API will make sense to the next person who reads it. LLMs can help us write the code. They can't tell us what to build or why.
For this reason, I use a simple mental model for how LLMs fit into our profession:
LLMs automate typing, not thinking.
Three rules for working with agents
If agents handle the typing, we can focus on the thinking, and at Zed that means sweating the details. Every line matters because a single bad line can break security guarantees or violate invariants the rest of the codebase depends on. With agents taking on the mechanical work, we have more room for that craftsmanship. Code written with agents can, and therefore should, be held to a higher standard than hand-written code.
But knowing this and doing it are different things. Over the last few months, three rules have emerged that help me stay in control.
1. Only use agents for tasks you already know how to do
When you know the task, the relevant context and success criteria follow. This doesn't mean you need perfect clarity. If you have a rough idea, start by making a plan or asking the agent to help you research. Don't ask it to write code until you've done that fundamental thinking.
In practice: Prompt the agent with your ideas, mention the files that are relevant to its research, and ask it to make a plan. Once you're satisfied, have it write the plan to a PLAN.md file (or equivalent). This file should describe your goal, constraints, the files involved, and anything else you both think is useful. A good plan becomes reliable context you can use in future prompts. Invest your time upfront to get it right.
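For illustration, here's one shape such a file might take. The headings and paths are hypothetical, not a prescribed format:

```markdown
# PLAN: <short description of the change>

## Goal
What the change should accomplish, in one or two sentences.

## Constraints
- Invariants that must not be broken
- Performance, security, or compatibility requirements

## Relevant files
- path/to/module.rs: why it matters to this change
- path/to/module_tests.rs: existing coverage to extend

## Approach
1. Step-by-step outline agreed on before any code is written.
2. Each step small enough to review on its own.

## Open questions
Anything unresolved, to settle before implementation starts.
```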
2. Stay involved in the agent loop
Knowing the task gets you started, but the real world is messier than any plan. Agents work best in tightly constrained environments, and they do a poor job of detecting uncertainty. In my experience, agents tend to push through unclear states, abandon tasks half-finished, or fail to make logical deductions. Stay engaged so you can observe these failure modes and adjust course before they compound.
In practice: Keep tasks small enough to review in one sitting and understand them well enough that you can predict what the implementation should look like. Watch for signs the agent is off-track: unexpected file changes, repetitive attempts at the same fix, or TODO comments where real code should be. When you see these, stop and try to understand why the agent ran aground. Ask the agent why it did something, export the thread to ask another agent about what happened, and look at the code yourself.
3. Review, review, review
Staying involved during the process is essential, but it doesn't replace scrutiny at the end. Every line of agent-generated code needs your sign-off before it ships. Treat the code it writes like a PR from an external contributor: the agent doesn't know your engineering practices, values, or standards, so its code deserves stricter scrutiny than you'd give a colleague's. Your name is on the code. Make sure you can stand behind it.
In practice: Managing review burden is fundamental to using agents successfully. Lean on outputs you can verify: write the tricky algorithm yourself, rename its function with an LSP, then have the agent generate test cases. What you write, you understand. What the tools produce should be self-evident or explicit enough to check at a glance.
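As a hypothetical illustration of that split (the function and test values below are made up): the human writes and names the algorithm, and the agent generates test cases that are plain input/expected pairs a reviewer can verify at a glance.

```rust
/// Hand-written: wrap a zero-based column index into [0, width),
/// so negative and overflowing indices wrap around like cursor movement.
pub fn wrap_column(column: i64, width: i64) -> i64 {
    assert!(width > 0, "width must be positive");
    column.rem_euclid(width)
}

#[cfg(test)]
mod tests {
    use super::*;

    // Agent-generated style: every case is a literal input/expected pair,
    // so review doesn't require re-deriving the algorithm.
    #[test]
    fn keeps_in_range_values() {
        assert_eq!(wrap_column(0, 80), 0);
        assert_eq!(wrap_column(79, 80), 79);
    }

    #[test]
    fn wraps_overflow_and_negatives() {
        assert_eq!(wrap_column(80, 80), 0);
        assert_eq!(wrap_column(81, 80), 1);
        assert_eq!(wrap_column(-1, 80), 79);
        assert_eq!(wrap_column(-80, 80), 0);
    }
}
```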
The tools have changed, the craft hasn't
Focus on the thinking and let agents do the typing. Use the three rules to guide your workflow: plan first so you know what you're building, stay engaged so you never lose the thread, and review the output so you can stand behind what ships.
Follow this process and agents become a way to raise your standards. You can write thorough tests, build clean abstractions, and make fewer compromises. Skip them and you're just signing up for frustrating debugging sessions later.
The craft is the same as it's always been. We just have more time for it now.
Resources
- Anthropic: Effective Harnesses for Long-Running Agents
- Oxide RFD 0576
- 3Blue1Brown: Neural Networks
- Linear AI Guidelines