This Week at Zed Industries: #12

July 21st, 2023

This week has felt momentous for the team. We publicly launched our preview channel and closed a handful of top-ranked feature requests. We're also continuing our exploration of AI within the editor and pushing forward with our efforts to open source Zed through Zed.

Joseph

I did a bit of refinement on the calls dashboard early this week. Among the changes were new charts that bucket calls by duration and by start time.

I've been wanting to publicly release our preview channel for some time now, so I wrote a small blog post and helped to tweak some zed.dev code in order to get this out the door. I'm stoked to have more test coverage on new features before they go out to the stable channel.

Lastly, there were a few smaller tasks I wanted to address this week. One of these was adjusting the in-app feedback submission mechanism to provide users with confirmation that their feedback is being sent, while also preventing any subsequent attempts to resubmit the same feedback. This felt necessary as the frequency of duplicate feedback in our database has been increasing.

Mikayla

Last week, I added a bunch of features, and over the weekend, I decided to also start working on Search and Replace :D. I want Zed to have all the shiny features stabilized before we start live streaming. This week has mostly been about stabilizing and tuning those features and fixing bugs.

Piotr

Last week, I worked on PHP LSP integration and continued my work on lookup tables for tree-sitter. The PR for it is up, and folks over there seemed happy to see this change (or enthusiastic about it, at the very least). For the next week, I plan to go back to Zed and work on our UI a bit more.

Kirill

My work on terminal highlights and Cmd+hover navigation continued, and I released the first iteration. I really missed that capability in Zed and am a happy user now. It's amazing to see people providing feedback on the hiccups; hopefully their experience has still improved overall.

One great possibility this basic implementation brings is a path towards external file drag and drop: we now have the logic to handle "try to open this abstract thing," tested and ready to accept filesystem paths too.

The rest of my week was mostly spent fixing bugs. The most notable fix involved keybindings and how they're displayed in the system and Zed menus: they now correctly reflect user overrides.

Antonio

We're having a great time developing CRDB, which we envision as being the backbone of our platform that will eventually be open-sourced (find more details about this here: https://zed.dev/blog/open-sourcing-zed-on-zed).

Recently, our attention has been focused on exploring a Directed Acyclic Graph (DAG) representation for operations. The use of a DAG to encapsulate operations provides us with a clear understanding of each operation's causal history. This is critical for efficiently tracking concurrency among replicas and branches without version vectors.
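
A rough sketch of the idea (with illustrative names, not the actual CRDB types): each operation records the revision it was applied to, and those parent edges are what form the DAG. Concurrency then falls out of reachability in the graph rather than out of per-operation version vectors.

use std::collections::HashMap;

// Illustrative types only; the real CRDB structures are richer than this.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct OperationId(u64);

// A revision is identified by the concurrent operations that produced it.
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
struct RevisionId(Vec<OperationId>);

struct Operation {
    id: OperationId,
    // The revision this operation was applied to; these edges form the DAG.
    parent: RevisionId,
}

// Two operations are concurrent when neither is reachable from the other
// by following parent edges through this graph.
struct OperationGraph {
    operations: HashMap<OperationId, Operation>,
}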

Conrad

I am focused on improving Vim emulation. The major feature this week is search and related commands: / and ?, * and #, n and N all now work as they should. Vim search builds on Zed's built-in search to feel snappy and integrated with the rest of the experience. I also read through and sorted all the Vim-related feedback I could find and have prioritized the next set of work – stay tuned!

Kyle

This week we continued to push forward on the semantic index. Notably, Max and I reworked our tree-sitter query engine for parsing symbol objects for embeddings, adding options to collapse nested objects when needed. This substantially reduces the token count while maintaining the larger hierarchical context, speeds up embedding, and reduces redundant search results. Beyond this, we spent some time refactoring our reindexing process to allow for more nuanced reindexing jobs, and added semantic search as an option in the reworked project search UI.
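
To give a concrete feel for what "collapsing" means here, a minimal sketch (not our actual query engine, which works over tree-sitter query captures): given the byte range of an outer symbol and the ranges of its nested children's bodies, replace each child body with a placeholder before embedding the outer symbol.

/// Illustrative sketch: collapse the bodies of nested symbols inside an outer
/// symbol's source text, replacing each with a placeholder. `outer` and
/// `children` are byte ranges into `source`, assumed sorted, non-overlapping,
/// and falling on character boundaries.
fn collapse_nested(
    source: &str,
    outer: std::ops::Range<usize>,
    children: &[std::ops::Range<usize>],
) -> String {
    let mut collapsed = String::new();
    let mut cursor = outer.start;
    for child in children {
        // Keep the outer symbol's text up to the child, then elide the child's body.
        collapsed.push_str(&source[cursor..child.start]);
        collapsed.push_str("{ /* ... */ }");
        cursor = child.end;
    }
    collapsed.push_str(&source[cursor..outer.end]);
    collapsed
}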

Julia

Apart from looking at various crashes, one interesting thing I worked on this week was treating ctrl-click as a right click. It's a standard feature of macOS apps, but we never noticed we didn't support it because everyone on the team with a trackpad uses two-finger click. This presents an interesting quandary, as Zed is intended to eventually be a cross-platform app and neither Windows nor Linux has a convention for similar behavior. That means we probably don't want to handle it in app code: it's typically best to keep platform-specific concerns within the platform layer, allowing the core app code to remain platform indifferent. So I went into our macOS platform event code and filtered out left-mouse-down events with control pressed. Now, whenever one gets sent to us by the OS, we replace it with a right mouse down followed by a right mouse up, allowing the core app code to just see a normal right click.
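
Sketched very roughly, with made-up event types rather than the actual GPUI platform layer, the translation looks something like this:

// Illustrative types; the real event handling lives in Zed's macOS platform layer.
#[derive(Clone, Copy)]
enum MouseButton {
    Left,
    Right,
}

#[derive(Clone, Copy)]
struct Modifiers {
    control: bool,
}

#[derive(Clone, Copy)]
enum PlatformEvent {
    MouseDown { button: MouseButton, modifiers: Modifiers },
    MouseUp { button: MouseButton, modifiers: Modifiers },
}

// Translate ctrl + left mouse down into a right mouse down followed by a
// right mouse up, so the core app code only ever sees a normal right click.
fn translate(event: PlatformEvent) -> Vec<PlatformEvent> {
    match event {
        PlatformEvent::MouseDown {
            button: MouseButton::Left,
            modifiers,
        } if modifiers.control => vec![
            PlatformEvent::MouseDown { button: MouseButton::Right, modifiers },
            PlatformEvent::MouseUp { button: MouseButton::Right, modifiers },
        ],
        other => vec![other],
    }
}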

Max

This week, when I wasn't working with Kyle on semantic search, I worked on an interesting bug that shows up when opening Zed in very large git repositories (like WebKit), where git status takes a very long time to run. In such repositories, slow git status calls cause our filesystem scan, which normally runs in parallel on every CPU core, to degrade to effectively single-threaded execution due to lock contention, and this makes project files load very slowly. I'm still working out how and when files' git statuses should be retrieved so as to avoid this problem in giant repos.
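
The general shape of the mitigation, sketched with hypothetical names rather than the real worktree code, is to avoid holding the shared scan state's lock while the slow git call runs, locking only briefly to record the result:

use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};

// Hypothetical shared state for the parallel filesystem scan.
#[derive(Default)]
struct ScanState {
    git_statuses: HashMap<PathBuf, String>,
}

fn scan_entry(state: Arc<Mutex<ScanState>>, repo_root: PathBuf) {
    // Run the potentially slow `git status` *without* holding the lock,
    // so other scan threads aren't serialized behind it.
    let status = slow_git_status(&repo_root);

    // Lock only briefly to record the result.
    state.lock().unwrap().git_statuses.insert(repo_root, status);
}

fn slow_git_status(_repo_root: &Path) -> String {
    // Stand-in for shelling out to git or calling libgit2.
    String::new()
}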

Derek

This week, I worked primarily with Nathan on the new GPUI updates and with Mikayla on some UI updates. I created an initial icon set covering file icons and various other UI icon needs, and updated the new search UI to support additional features folks are working on. Stay tuned for that UI to land in the app sometime in the near future.

Nathan

I paired a bit more with Derek on improving our approach to UI layout and styling, but spent the bulk of my engineering-related time working on CRDB with Antonio.

One interesting aspect worth sharing was the approach we're taking to networking and messages. Currently, collaboration in Zed relies on Protocol Buffers for all of our network messages, but maintaining an external schema is a bit unwieldy in a number of ways. For CRDB, I was curious whether we could find an approach that was pure Rust. So we're trying a crate called serde-bare, an implementation of the BARE message encoding, a compact binary serialization format that lets our message types be defined entirely in Rust.
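
For a sense of what that looks like in practice, here's a minimal round trip through serde-bare, using a simplified stand-in for one of our message types (the fields are placeholders):

use serde::{Deserialize, Serialize};

// Simplified stand-in for a message type; derive serde's traits and let
// serde-bare handle the wire format.
#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
struct PublishOperations {
    repo_id: u64,
    operations: Vec<Vec<u8>>,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let message = PublishOperations {
        repo_id: 1,
        operations: vec![vec![1, 2, 3]],
    };

    // Encode to the compact BARE binary representation...
    let bytes = serde_bare::to_vec(&message)?;
    // ...and decode it back.
    let decoded: PublishOperations = serde_bare::from_slice(&bytes)?;
    assert_eq!(message, decoded);
    Ok(())
}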

We have two kinds of messages: one kind will be broadcast via LiveKit data channels to minimize latency when sharing edits with a collaborator; the other kind is requests that will be sent to our servers for persistence. We wrap each kind of message in a different envelope enum, like this.

use serde::{Deserialize, Serialize};

// These are sent to our server.
#[derive(Clone, Debug, Serialize, Deserialize)]
pub enum RequestEnvelope {
    PublishRepo(PublishRepo),
    CloneRepo(CloneRepo),
    SyncRepo(SyncRepo),
    PublishOperations(PublishOperations),
}
 
// These are sent via LiveKit data channels.
// Currently we only implement one type of message, but this could expand.
#[derive(Clone, Debug, Serialize, Deserialize)]
pub enum MessageEnvelope {
    Operation(Operation),
}

On each of these enums, we implement an unwrap method that returns a Box<dyn Any> with its contents. We can then use dynamic dispatch based on each message's TypeId to route messages to the appropriate handler. To go the other direction, each message implements Into<RequestEnvelope> or Into<MessageEnvelope>. Presumably we can cover these with macros.
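
Hand-written for one message type (this is the kind of thing we'd presumably generate with a macro), the wrapping and routing looks roughly like this, building on the enums above:

use std::any::{Any, TypeId};
use std::collections::HashMap;

// Wrapping: each concrete message knows which envelope variant it becomes.
impl From<PublishOperations> for RequestEnvelope {
    fn from(request: PublishOperations) -> Self {
        RequestEnvelope::PublishOperations(request)
    }
}

impl RequestEnvelope {
    // Unwrapping: type-erase the payload so it can be routed by TypeId.
    pub fn unwrap(self) -> Box<dyn Any> {
        match self {
            RequestEnvelope::PublishRepo(request) => Box::new(request),
            RequestEnvelope::CloneRepo(request) => Box::new(request),
            RequestEnvelope::SyncRepo(request) => Box::new(request),
            RequestEnvelope::PublishOperations(request) => Box::new(request),
        }
    }
}

// Routing: a handler table keyed by the payload's concrete TypeId.
struct Router {
    handlers: HashMap<TypeId, Box<dyn Fn(Box<dyn Any>)>>,
}

impl Router {
    fn route(&self, envelope: RequestEnvelope) {
        let payload = envelope.unwrap();
        // Take the TypeId of the trait object's contents, not of the Box itself.
        let type_id = (&*payload).type_id();
        if let Some(handler) = self.handlers.get(&type_id) {
            handler(payload);
        }
    }
}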

Once we got networking ironed out, we implemented synchronization. This is based on vector clocks, which will scale according to the total number of replicas created for a repository. It's too expensive to send these with every operation, but they're very useful for sync, which is essentially a distributed set union between the sets of operations contained by two replicas. When syncing with the server, the client sends a vector clock representing everything it has seen. The server then sends back any operations it has that aren't present on the client, along with its own vector clock representing what the server has. The client then follows up by sending anything the server is missing.
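
Sketching that exchange with deliberately simplified types (real operation ids and clocks carry more than this), each side can compute what the other is missing by comparing operation ids against the peer's vector clock:

use std::collections::HashMap;

// Simplified: identify each operation as "the nth operation from replica r".
type ReplicaId = u32;

#[derive(Clone, Copy)]
struct OperationId {
    replica: ReplicaId,
    count: u64,
}

// A vector clock: the highest operation count seen from each replica.
type VectorClock = HashMap<ReplicaId, u64>;

// The operations in `ours` that a peer with clock `theirs` hasn't seen.
// Sync is then: the client sends its clock; the server replies with the
// missing operations plus its own clock; the client sends back what the
// server lacks.
fn missing_from(ours: &[OperationId], theirs: &VectorClock) -> Vec<OperationId> {
    ours.iter()
        .copied()
        .filter(|op| theirs.get(&op.replica).copied().unwrap_or(0) < op.count)
        .collect()
}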

With sync out of the way, we started on applying operations, which is interesting. Every operation is associated with a parent RevisionId, which represents the revision of the repository to which that operation was applied. A RevisionId contains one or more concurrent OperationIds. To apply the operation, we need to retrieve and/or construct a revision matching this id. If we already have a cached revision matching that id, we can simply use it. If we don't, we walk backward through the DAG to find a revision representing the common ancestor of all the operations listed in the revision id for which we have a cached snapshot, then apply operations going forward to reconstruct it.
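
In sketch form (illustrative types again, with the actual B-tree snapshots and DAG traversal elided), retrieving or reconstructing the revision an operation applies to looks roughly like this:

use std::collections::HashMap;

// Illustrative stand-ins; see the DAG sketch earlier for the general shapes.
#[derive(Clone, PartialEq, Eq, Hash)]
struct RevisionId(Vec<u64>);

#[derive(Clone, Default)]
struct Revision; // a snapshot of repository state

struct Repo {
    cached_revisions: HashMap<RevisionId, Revision>,
}

impl Repo {
    // Retrieve or reconstruct the revision identified by `id`.
    fn revision(&mut self, id: &RevisionId) -> Revision {
        // Cheap path: we already have a cached snapshot for this exact revision.
        if let Some(revision) = self.cached_revisions.get(id) {
            return revision.clone();
        }
        // Otherwise walk backward through the DAG to a common ancestor of the
        // operations in `id` that does have a cached snapshot, then replay the
        // intervening operations forward.
        let (ancestor, path) = self.cached_common_ancestor(id);
        let mut revision = ancestor;
        for operation in path {
            revision = self.apply(revision, operation);
        }
        self.cached_revisions.insert(id.clone(), revision.clone());
        revision
    }

    fn cached_common_ancestor(&self, _id: &RevisionId) -> (Revision, Vec<u64>) {
        // Elided: DAG traversal to the nearest cached ancestor and the list of
        // operations needed to rebuild `_id` from it.
        (Revision, Vec::new())
    }

    fn apply(&self, revision: Revision, _operation: u64) -> Revision {
        // Elided: apply one operation to a copy-on-write snapshot.
        revision
    }
}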

Initially, we'll maintain a snapshot of every state, so finding a common ancestor should be very cheap. Over time, we'll garbage collect older snapshots, so reconstructing a state further back in time may become more expensive, but that's a coarse-grained operation anyway.

We think we have a working version of the revision reconstruction, and we're currently in the middle of implementing an edit operation based upon it. Fun times traversing and constructing copy-on-write B-trees.

Thanks for reading!