# Language Extensions

Language support in Zed has several components:

- Language metadata and configuration
- Grammar
- Queries
- Language servers

## Language Metadata

Each language supported by Zed must be defined in a subdirectory inside the `languages` directory of your extension.

This subdirectory must contain a file called `config.toml` with the following structure:

```toml
name = "My Language"
grammar = "my-language"
path_suffixes = ["myl"]
line_comments = ["# "]
```

- `name` is the human-readable name that will show up in the Select Language dropdown.
- `grammar` is the name of a grammar. Grammars are registered separately, as described below.
- `path_suffixes` (optional) is an array of file suffixes that should be associated with this language. This supports glob patterns like `config/**/*.toml`, where `**` matches 0 or more directories and `*` matches 0 or more characters.
- `line_comments` (optional) is an array of strings that are used to identify line comments in the language.

<!--
TBD: Document `language_name/config.toml` keys

- line_comments, block_comment
- autoclose_before
- brackets (start, end, close, newline, not_in: ["comment", "string"])
- tab_size, hard_tabs
- word_characters
- prettier_parser_name
- opt_into_language_servers
- first_line_pattern
- code_fence_block_name
- scope_opt_in_language_servers
- increase_indent_pattern, decrease_indent_pattern
- collapsed_placeholder
-->

## Grammar

Zed uses the [Tree-sitter](https://tree-sitter.github.io) parsing library to provide built-in language-specific features. There are grammars available for many languages, and you can also [develop your own grammar](https://tree-sitter.github.io/tree-sitter/creating-parsers#writing-the-grammar). A growing list of Zed features is built using pattern matching over syntax trees with Tree-sitter queries.

As mentioned above, every language that is defined in an extension must specify the name of a Tree-sitter grammar that is used for parsing. These grammars are then registered separately in the extension's `extension.toml` file, like this:

```toml
[grammars.gleam]
repository = "https://github.com/gleam-lang/tree-sitter-gleam"
commit = "58b7cac8fc14c92b0677c542610d8738c373fa81"
```

The `repository` field must specify the repository that the Tree-sitter grammar should be loaded from, and the `commit` field must contain the SHA of the Git commit to use. An extension can provide multiple grammars by referencing multiple Tree-sitter repositories.

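To illustrate how these pieces connect, here is a sketch of an `extension.toml` that registers two grammars. The grammar names, repository URLs, and commit values are hypothetical placeholders rather than real projects. The sketch also assumes that the key after `grammars.` is the name a language's `config.toml` refers to in its `grammar` field.

```toml
# extension.toml (sketch; names, URLs, and SHAs below are placeholders)

# Referenced from languages/my-language/config.toml as: grammar = "my-language"
[grammars.my-language]
repository = "https://github.com/example/tree-sitter-my-language"
commit = "0000000000000000000000000000000000000000" # replace with a real commit SHA

# A second grammar provided by the same extension
[grammars.my-template-language]
repository = "https://github.com/example/tree-sitter-my-template-language"
commit = "1111111111111111111111111111111111111111" # replace with a real commit SHA
```

Pinning each grammar to a full commit SHA, rather than a branch name, keeps the parser used by the extension reproducible.
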
## Tree-sitter Queries

Zed uses the syntax tree produced by [Tree-sitter](https://tree-sitter.github.io), together with Tree-sitter's query language, to implement several features:

- Syntax highlighting
- Bracket matching
- Code outline/structure
- Auto-indentation
- Code injections
- Syntax overrides
- Text redactions
- Runnable code detection

The following sections elaborate on how [Tree-sitter queries](https://tree-sitter.github.io/tree-sitter/using-parsers#query-syntax) enable these features in Zed, using [JSON syntax](https://www.json.org/json-en.html) as a guiding example.

### Syntax highlighting

In Tree-sitter, the `highlights.scm` file defines syntax highlighting rules for a particular syntax.

Here's an example from a `highlights.scm` for JSON:

```scheme
(string) @string

(pair
  key: (string) @property.json_key)

(number) @number
```

This query marks strings, object keys, and numbers for highlighting. The following is a comprehensive list of captures supported by themes:

| Capture                  | Description                            |
| ------------------------ | -------------------------------------- |
| @attribute               | Captures attributes                    |
| @boolean                 | Captures boolean values                |
| @comment                 | Captures comments                      |
| @comment.doc             | Captures documentation comments        |
| @constant                | Captures constants                     |
| @constructor             | Captures constructors                  |
| @embedded                | Captures embedded content              |
| @emphasis                | Captures emphasized text               |
| @emphasis.strong         | Captures strongly emphasized text      |
| @enum                    | Captures enumerations                  |
| @function                | Captures functions                     |
| @hint                    | Captures hints                         |
| @keyword                 | Captures keywords                      |
| @label                   | Captures labels                        |
| @link_text               | Captures link text                     |
| @link_uri                | Captures link URIs                     |
| @number                  | Captures numeric values                |
| @operator                | Captures operators                     |
| @predictive              | Captures predictive text               |
| @preproc                 | Captures preprocessor directives       |
| @primary                 | Captures primary elements              |
| @property                | Captures properties                    |
| @punctuation             | Captures punctuation                   |
| @punctuation.bracket     | Captures brackets                      |
| @punctuation.delimiter   | Captures delimiters                    |
| @punctuation.list_marker | Captures list markers                  |
| @punctuation.special     | Captures special punctuation           |
| @string                  | Captures string literals               |
| @string.escape           | Captures escaped characters in strings |
| @string.regex            | Captures regular expressions           |
| @string.special          | Captures special strings               |
| @string.special.symbol   | Captures special symbols               |
| @tag                     | Captures tags                          |
| @text.literal            | Captures literal text                  |
| @title                   | Captures titles                        |
| @type                    | Captures types                         |
| @variable                | Captures variables                     |
| @variable.special        | Captures special variables             |
| @variant                 | Captures variants                      |

### Bracket matching

The `brackets.scm` file defines matching brackets.

Here's an example from a `brackets.scm` file for JSON:

```scheme
("[" @open "]" @close)
("{" @open "}" @close)
("\"" @open "\"" @close)
```

This query identifies opening and closing brackets, braces, and quotation marks.

| Capture | Description                                   |
| ------- | --------------------------------------------- |
| @open   | Captures opening brackets, braces, and quotes |
| @close  | Captures closing brackets, braces, and quotes |

### Code outline/structure

The `outline.scm` file defines the structure for the code outline.

Here's an example from an `outline.scm` file for JSON:

```scheme
(pair
  key: (string (string_content) @name)) @item
```

This query captures object keys for the outline structure.

| Capture        | Description                                                                              |
| -------------- | ---------------------------------------------------------------------------------------- |
| @name          | Captures the content of object keys                                                      |
| @item          | Captures the entire key-value pair                                                       |
| @context       | Captures elements that provide context for the outline item                              |
| @context.extra | Captures additional contextual information for the outline item                          |
| @annotation    | Captures nodes that annotate the outline item (doc comments, attributes, decorators)[^1] |

[^1]: These annotations are used by Assistant when generating code modification steps.

### Auto-indentation

The `indents.scm` file defines indentation rules.

Here's an example from an `indents.scm` file for JSON:

```scheme
(array "]" @end) @indent
(object "}" @end) @indent
```

This query marks the end of arrays and objects for indentation purposes.

| Capture | Description                                        |
| ------- | -------------------------------------------------- |
| @end    | Captures closing brackets and braces               |
| @indent | Captures entire arrays and objects for indentation |

### Code injections

The `injections.scm` file defines rules for embedding one language within another, such as code blocks in Markdown or SQL queries in Python strings.

Here's an example from an `injections.scm` file for Markdown:

```scheme
(fenced_code_block
  (info_string
    (language) @language)
  (code_fence_content) @content)

((inline) @content
 (#set! "language" "markdown-inline"))
```

This query identifies fenced code blocks, capturing the language specified in the info string and the content within the block. It also captures inline content and sets its language to "markdown-inline".

| Capture   | Description                                                |
| --------- | ---------------------------------------------------------- |
| @language | Captures the language identifier for a code block          |
| @content  | Captures the content to be treated as a different language |

Note that we couldn't use JSON as an example here because it doesn't support language injections.

### Syntax overrides

The `overrides.scm` file defines syntax overrides.

Here's an example from an `overrides.scm` file for JSON:

```scheme
(string) @string
```

This query explicitly marks strings for highlighting, potentially overriding default behavior. For a complete list of supported captures, refer to the [Syntax highlighting](#syntax-highlighting) section above.

### Text redactions

The `redactions.scm` file defines text redaction rules. When you are collaborating and sharing your screen, Zed renders the matched syntax nodes in a redacted form so that their contents are not leaked.

Here's an example from a `redactions.scm` file for JSON:

```scheme
(pair value: (number) @redact)
(pair value: (string) @redact)
(array (number) @redact)
(array (string) @redact)
```

This query marks number and string values in key-value pairs and arrays for redaction.

| Capture | Description                    |
| ------- | ------------------------------ |
| @redact | Captures values to be redacted |

### Runnable code detection

The `runnables.scm` file defines rules for detecting runnable code.

Here's an example from a `runnables.scm` file for JSON:

```scheme
(
    (document
        (object
            (pair
                key: (string
                    (string_content) @_name
                    (#eq? @_name "scripts")
                )
                value: (object
                    (pair
                        key: (string (string_content) @run @script)
                    )
                )
            )
        )
    )
    (#set! tag package-script)
    (#set! tag composer-script)
)
```

This query detects runnable scripts in `package.json` and `composer.json` files.

The `@run` capture specifies where the run button should appear in the editor. Other captures, except those prefixed with an underscore, are exposed as environment variables named `ZED_CUSTOM_$(capture_name)` when running the code.

| Capture | Description                                            |
| ------- | ------------------------------------------------------ |
| @\_name | Captures the "scripts" key                             |
| @run    | Captures the script name                               |
| @script | Also captures the script name (for different purposes) |

TBD: `#set! tag`

## Language Servers

Zed uses the [Language Server Protocol](https://microsoft.github.io/language-server-protocol/) to provide advanced language support.

An extension may provide any number of language servers. To provide a language server from your extension, add an entry to your `extension.toml` with the name of your language server and the language it applies to:

```toml
[language_servers.my-language]
name = "My Language LSP"
language = "My Language"
```

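Since an extension may provide any number of language servers, an entry like the one above can be repeated once per server. A minimal sketch, in which the second server and its names are hypothetical and both servers are assumed to apply to the same language:

```toml
# extension.toml (sketch; the second entry is a hypothetical example)
[language_servers.my-language]
name = "My Language LSP"
language = "My Language"

[language_servers.my-language-linter]
name = "My Language Linter"
language = "My Language"
```

Here the `language` value is assumed to match the `name` declared in that language's `config.toml`, and each table key (`my-language`, `my-language-linter`) is assumed to identify one server.
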
Then, in the Rust code for your extension, implement the `language_server_command` method on your extension:

```rust
impl zed::Extension for MyExtension {
    fn language_server_command(
        &mut self,
        language_server_id: &LanguageServerId,
        worktree: &zed::Worktree,
    ) -> Result<zed::Command> {
        // The three helper functions below are placeholders for your own logic.
        Ok(zed::Command {
            command: get_path_to_language_server_executable()?,
            args: get_args_for_language_server()?,
            env: get_env_for_language_server()?,
        })
    }
}
```

You can customize the handling of the language server using several optional methods in the `Extension` trait. For example, you can control how completions are styled using the `label_for_completion` method. For a complete list of methods, see the [API docs for the Zed extension API](https://docs.rs/zed_extension_api).