# Language Extensions

Language support in Zed has several components:

- Language metadata and configuration
- Grammar
- Queries
- Language servers

## Language Metadata

Each language supported by Zed must be defined in a subdirectory inside the `languages` directory of your extension.

This subdirectory must contain a file called `config.toml` with the following structure:

```toml
name = "My Language"
grammar = "my-language"
path_suffixes = ["myl"]
line_comments = ["# "]
```

- `name` is the human-readable name that will show up in the Select Language dropdown.
- `grammar` is the name of a grammar. Grammars are registered separately, as described below.
- `path_suffixes` (optional) is an array of file suffixes that should be associated with this language. This supports glob patterns like `config/**/*.toml` where `**` matches 0 or more directories and `*` matches 0 or more characters.
- `line_comments` (optional) is an array of strings that are used to identify line comments in the language.
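
As a rough illustration of how `path_suffixes` associates files with a language, the following Rust sketch checks a path against plain suffixes. It is not Zed's implementation: the function name `matches_path_suffix` is hypothetical, it assumes a suffix may match either the file extension or the whole file name, and it deliberately leaves glob patterns out.

```rust
use std::path::Path;

/// Hypothetical helper (not part of Zed): returns true when `path` matches one
/// of the plain `suffixes`, either as its extension or as its whole file name.
/// Glob patterns are intentionally not handled in this sketch.
fn matches_path_suffix(path: &str, suffixes: &[&str]) -> bool {
    let path = Path::new(path);
    let extension = path.extension().and_then(|e| e.to_str());
    let file_name = path.file_name().and_then(|n| n.to_str());
    suffixes
        .iter()
        .any(|suffix| extension == Some(*suffix) || file_name == Some(*suffix))
}
```

Under these assumptions, `matches_path_suffix("src/main.myl", &["myl"])` is true, while a file such as `src/main.rs` would not be associated with "My Language".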
 27
<!--
TBD: Document `language_name/config.toml` keys

- line_comments, block_comment
- autoclose_before
- brackets (start, end, close, newline, not_in: ["comment", "string"])
- tab_size, hard_tabs
- word_characters
- prettier_parser_name
- opt_into_language_servers
- first_line_pattern
- code_fence_block_name
- scope_opt_in_language_servers
- increase_indent_pattern, decrease_indent_pattern
- collapsed_placeholder
-->
 44
## Grammar

Zed uses the [Tree-sitter](https://tree-sitter.github.io) parsing library to provide built-in language-specific features. There are grammars available for many languages, and you can also [develop your own grammar](https://tree-sitter.github.io/tree-sitter/creating-parsers#writing-the-grammar). A growing list of Zed features is built using pattern matching over syntax trees with Tree-sitter queries.

As mentioned above, every language that is defined in an extension must specify the name of a Tree-sitter grammar that is used for parsing. These grammars are then registered separately in the extension's `extension.toml` file, like this:

```toml
[grammars.gleam]
repository = "https://github.com/gleam-lang/tree-sitter-gleam"
commit = "58b7cac8fc14c92b0677c542610d8738c373fa81"
```

The `repository` field must specify the repository that the Tree-sitter grammar should be loaded from, and the `commit` field must contain the SHA of the Git commit to use. An extension can provide multiple grammars by referencing multiple Tree-sitter repositories.
 56
## Tree-sitter Queries

Zed runs queries written in the [Tree-sitter](https://tree-sitter.github.io) query language against the syntax tree to implement several features:

- Syntax highlighting
- Bracket matching
- Code outline/structure
- Auto-indentation
- Code injections
- Syntax overrides
- Text redactions
- Runnable code detection

The following sections elaborate on how [Tree-sitter queries](https://tree-sitter.github.io/tree-sitter/using-parsers#query-syntax) enable these features in Zed, using [JSON syntax](https://www.json.org/json-en.html) as a guiding example.
 73
### Syntax highlighting

In Tree-sitter, the `highlights.scm` file defines syntax highlighting rules for a particular language.

Here's an example from a `highlights.scm` for JSON:

```scheme
(string) @string

(pair
  key: (string) @property.json_key)

(number) @number
```

This query marks strings, object keys, and numbers for highlighting. The following is a comprehensive list of captures supported by themes:

| Capture                  | Description                            |
| ------------------------ | -------------------------------------- |
| @attribute               | Captures attributes                    |
| @boolean                 | Captures boolean values                |
| @comment                 | Captures comments                      |
| @comment.doc             | Captures documentation comments        |
| @constant                | Captures constants                     |
| @constructor             | Captures constructors                  |
| @embedded                | Captures embedded content              |
| @emphasis                | Captures emphasized text               |
| @emphasis.strong         | Captures strongly emphasized text      |
| @enum                    | Captures enumerations                  |
| @function                | Captures functions                     |
| @hint                    | Captures hints                         |
| @keyword                 | Captures keywords                      |
| @label                   | Captures labels                        |
| @link_text               | Captures link text                     |
| @link_uri                | Captures link URIs                     |
| @number                  | Captures numeric values                |
| @operator                | Captures operators                     |
| @predictive              | Captures predictive text               |
| @preproc                 | Captures preprocessor directives       |
| @primary                 | Captures primary elements              |
| @property                | Captures properties                    |
| @punctuation             | Captures punctuation                   |
| @punctuation.bracket     | Captures brackets                      |
| @punctuation.delimiter   | Captures delimiters                    |
| @punctuation.list_marker | Captures list markers                  |
| @punctuation.special     | Captures special punctuation           |
| @string                  | Captures string literals               |
| @string.escape           | Captures escaped characters in strings |
| @string.regex            | Captures regular expressions           |
| @string.special          | Captures special strings               |
| @string.special.symbol   | Captures special symbols               |
| @tag                     | Captures tags                          |
| @tag.doctype             | Captures doctypes (e.g., in HTML)      |
| @text.literal            | Captures literal text                  |
| @title                   | Captures titles                        |
| @type                    | Captures types                         |
| @variable                | Captures variables                     |
| @variable.special        | Captures special variables             |
| @variant                 | Captures variants                      |
133
### Bracket matching

The `brackets.scm` file defines matching brackets.

Here's an example from a `brackets.scm` file for JSON:

```scheme
("[" @open "]" @close)
("{" @open "}" @close)
("\"" @open "\"" @close)
```

This query identifies opening and closing brackets, braces, and quotation marks.

| Capture | Description                                   |
| ------- | --------------------------------------------- |
| @open   | Captures opening brackets, braces, and quotes |
| @close  | Captures closing brackets, braces, and quotes |
152
### Code outline/structure

The `outline.scm` file defines the structure for the code outline.

Here's an example from an `outline.scm` file for JSON:

```scheme
(pair
  key: (string (string_content) @name)) @item
```

This query captures object keys for the outline structure.

| Capture        | Description                                                                              |
| -------------- | ---------------------------------------------------------------------------------------- |
| @name          | Captures the content of object keys                                                      |
| @item          | Captures the entire key-value pair                                                       |
| @context       | Captures elements that provide context for the outline item                              |
| @context.extra | Captures additional contextual information for the outline item                          |
| @annotation    | Captures nodes that annotate an outline item (doc comments, attributes, decorators)[^1] |

[^1]: These annotations are used by Assistant when generating code modification steps.
175
### Auto-indentation

The `indents.scm` file defines indentation rules.

Here's an example from an `indents.scm` file for JSON:

```scheme
(array "]" @end) @indent
(object "}" @end) @indent
```

This query marks arrays and objects as indented regions and marks their closing brackets and braces as the end of the indentation.

| Capture | Description                                        |
| ------- | -------------------------------------------------- |
| @end    | Captures closing brackets and braces               |
| @indent | Captures entire arrays and objects for indentation |
193
### Code injections

The `injections.scm` file defines rules for embedding one language within another, such as code blocks in Markdown or SQL queries in Python strings.

Here's an example from an `injections.scm` file for Markdown:

```scheme
(fenced_code_block
  (info_string
    (language) @language)
  (code_fence_content) @content)

((inline) @content
 (#set! "language" "markdown-inline"))
```

This query identifies fenced code blocks, capturing the language specified in the info string and the content within the block. It also captures inline content and sets its language to `markdown-inline`.

| Capture   | Description                                                |
| --------- | ---------------------------------------------------------- |
| @language | Captures the language identifier for a code block          |
| @content  | Captures the content to be treated as a different language |

Note that we couldn't use JSON as an example here because it doesn't support language injections.
218
### Syntax overrides

The `overrides.scm` file defines syntax overrides.

Here's an example from an `overrides.scm` file for JSON:

```scheme
(string) @string
```

This query explicitly marks strings for highlighting, potentially overriding default behavior. For a complete list of supported captures, refer to the [Syntax highlighting](#syntax-highlighting) section above.
230
### Text redactions

The `redactions.scm` file defines text redaction rules. When you are collaborating and sharing your screen, Zed renders the matching syntax nodes in a redacted mode so that their contents do not leak.

Here's an example from a `redactions.scm` file for JSON:

```scheme
(pair value: (number) @redact)
(pair value: (string) @redact)
(array (number) @redact)
(array (string) @redact)
```

This query marks number and string values in key-value pairs and arrays for redaction.

| Capture | Description                    |
| ------- | ------------------------------ |
| @redact | Captures values to be redacted |
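
To make the effect of a `@redact` capture more concrete, here is a minimal Rust sketch that masks the byte ranges of captured nodes. It only illustrates the idea and is not how Zed renders redactions: the `redact` function and the use of `•` as the placeholder character are assumptions made for this example.

```rust
use std::ops::Range;

/// Hypothetical helper: replaces every character whose byte offset falls
/// inside one of `ranges` with `•`, leaving the rest of the text untouched.
/// Assumes the ranges start and end on character boundaries, as the byte
/// ranges of syntax nodes would.
fn redact(text: &str, ranges: &[Range<usize>]) -> String {
    text.char_indices()
        .map(|(offset, ch)| {
            if ranges.iter().any(|range| range.contains(&offset)) {
                '•'
            } else {
                ch
            }
        })
        .collect()
}
```

For the JSON text `{"token": "abc123"}`, the string value (including its quotes) occupies bytes 10 to 18, so `redact(text, &[10..18])` yields `{"token": ••••••••}`.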
249
### Runnable code detection

The `runnables.scm` file defines rules for detecting runnable code.

Here's an example from a `runnables.scm` file for JSON:

```scheme
(
    (document
        (object
            (pair
                key: (string
                    (string_content) @_name
                    (#eq? @_name "scripts")
                )
                value: (object
                    (pair
                        key: (string (string_content) @run @script)
                    )
                )
            )
        )
    )
    (#set! tag package-script)
    (#set! tag composer-script)
)
```

This query detects runnable scripts in `package.json` and `composer.json` files.

The `@run` capture specifies where the run button should appear in the editor. Other captures, except those prefixed with an underscore, are exposed as environment variables named `ZED_CUSTOM_$(capture_name)` when running the code.

| Capture | Description                                            |
| ------- | ------------------------------------------------------ |
| @\_name | Captures the "scripts" key                             |
| @run    | Captures the script name                               |
| @script | Also captures the script name (for different purposes) |
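
The naming rule for these environment variables can be sketched in Rust as follows. This is an illustration under stated assumptions rather than Zed's own code: the function `custom_env_var` is hypothetical, and it assumes the capture name is appended to `ZED_CUSTOM_` verbatim, without any change of case.

```rust
/// Hypothetical helper: maps a capture name (without the leading `@`) to the
/// environment variable name described above, or `None` if the capture is
/// not exposed.
fn custom_env_var(capture_name: &str) -> Option<String> {
    // `@run` only positions the run button, and captures prefixed with an
    // underscore are internal to the query, so neither is exposed.
    if capture_name == "run" || capture_name.starts_with('_') {
        return None;
    }
    // Assumption: the capture name is appended verbatim.
    Some(format!("ZED_CUSTOM_{}", capture_name))
}
```

With the query above, `custom_env_var("script")` gives `Some("ZED_CUSTOM_script")`, while `"_name"` and `"run"` both give `None`.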
287
<!--
TBD: `#set! tag`
-->
291
## Language Servers

Zed uses the [Language Server Protocol](https://microsoft.github.io/language-server-protocol/) to provide advanced language support.

An extension may provide any number of language servers. To provide a language server from your extension, add an entry to your `extension.toml` with the name of your language server and the language it applies to:

```toml
[language_servers.my-language]
name = "My Language LSP"
language = "My Language"
```

Then, in the Rust code for your extension, implement the `language_server_command` method on your extension:

```rust
use zed_extension_api::{self as zed, LanguageServerId, Result};

impl zed::Extension for MyExtension {
    // Other trait methods are omitted here.
    fn language_server_command(
        &mut self,
        language_server_id: &LanguageServerId,
        worktree: &zed::Worktree,
    ) -> Result<zed::Command> {
        // The three `get_*` helpers are placeholders for your own logic.
        Ok(zed::Command {
            command: get_path_to_language_server_executable()?,
            args: get_args_for_language_server()?,
            env: get_env_for_language_server()?,
        })
    }
}
```

You can customize the handling of the language server using several optional methods in the `Extension` trait. For example, you can control how completions are styled using the `label_for_completion` method. For a complete list of methods, see the [API docs for the Zed extension API](https://docs.rs/zed_extension_api).
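
The `get_*` helper functions in the `language_server_command` example above are placeholders. The sketch below shows one way they might be filled in. To stay self-contained it uses a local `Command` struct that mirrors the three fields used above instead of the real `zed_extension_api` types, so the field types are assumptions. The binary name `my-language-server`, the `--stdio` flag, and the `MY_LANGUAGE_LOG` variable are all made up for illustration, and the helpers are simplified so that only the path lookup can fail.

```rust
/// Local stand-in mirroring the fields of `zed::Command` used above.
/// Assumption: the field types are chosen for illustration; check the API
/// docs for the real definitions.
#[derive(Debug, PartialEq)]
struct Command {
    command: String,
    args: Vec<String>,
    env: Vec<(String, String)>,
}

/// Hypothetical: use the path of a binary that was located beforehand,
/// or report an error if none was found.
fn get_path_to_language_server_executable(located: Option<&str>) -> Result<String, String> {
    located
        .map(str::to_string)
        .ok_or_else(|| "my-language-server was not found".to_string())
}

/// Hypothetical: many language servers talk over stdin/stdout when asked to.
fn get_args_for_language_server() -> Vec<String> {
    vec!["--stdio".to_string()]
}

/// Hypothetical: extra environment variables for the server process.
fn get_env_for_language_server() -> Vec<(String, String)> {
    vec![("MY_LANGUAGE_LOG".to_string(), "info".to_string())]
}

/// Assembles the command in the same shape as the trait method above.
fn language_server_command(located: Option<&str>) -> Result<Command, String> {
    Ok(Command {
        command: get_path_to_language_server_executable(located)?,
        args: get_args_for_language_server(),
        env: get_env_for_language_server(),
    })
}
```

In a real extension, the lookup of the binary would typically go through the `worktree` argument or a download step, and the error would surface to the user when the server fails to start.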