---
title: Language Extensions
description: "Overview of programming language support in Zed, including built-in and extension-based languages."
---

# Language Extensions

Language support in Zed has several components:

- Language metadata and configuration
- Grammar
- Queries
- Language servers

## Language Metadata

Each language supported by Zed must be defined in a subdirectory inside the `languages` directory of your extension.

This subdirectory must contain a file called `config.toml` with the following structure:

```toml
name = "My Language"
grammar = "my-language"
path_suffixes = ["myl"]
line_comments = ["# "]
```

- `name` (required) is the human-readable name that will show up in the Select Language dropdown.
- `grammar` (required) is the name of a grammar. Grammars are registered separately, as described below.
- `path_suffixes` is an array of file suffixes that should be associated with this language. Unlike `file_types` in settings, this does not support glob patterns.
- `line_comments` is an array of strings that start a line comment in the language. Zed uses them for the `editor::ToggleComments` action ({#kb editor::ToggleComments}), which comments or uncomments lines of code.
- `tab_size` defines the indentation/tab size used for this language (default is `4`).
- `hard_tabs` sets whether to indent with tabs (`true`) or spaces (`false`, the default).
- `first_line_pattern` is a regular expression that can be used alongside `path_suffixes` (above) or `file_types` in settings to match files that should use this language. For example, Zed uses this to identify Shell Scripts by matching [shebang lines](https://github.com/zed-industries/zed/blob/main/crates/languages/src/bash/config.toml) in the first line of a script.
- `debuggers` is an array of strings that identify debuggers for the language. When you open a debugger's `New Process Modal`, Zed orders the available debuggers by the order of entries in this array.

<!--
TBD: Document `language_name/config.toml` keys

- autoclose_before
- brackets (start, end, close, newline, not_in: ["comment", "string"])
- word_characters
- prettier_parser_name
- opt_into_language_servers
- code_fence_block_name
- scope_opt_in_language_servers
- increase_indent_pattern, decrease_indent_pattern
- collapsed_placeholder
- auto_indent_on_paste, auto_indent_using_last_non_empty_line
- overrides: `[overrides.element]`, `[overrides.string]`
-->

## Grammar

Zed uses the [Tree-sitter](https://tree-sitter.github.io) parsing library to provide built-in language-specific features. There are grammars available for many languages, and you can also [develop your own grammar](https://tree-sitter.github.io/tree-sitter/creating-parsers/3-writing-the-grammar.html). A growing list of Zed features is built using pattern matching over syntax trees with Tree-sitter queries.

As mentioned above, every language that is defined in an extension must specify the name of a Tree-sitter grammar that is used for parsing. These grammars are then registered separately in the extension's `extension.toml` file, like this:

```toml
[grammars.gleam]
repository = "https://github.com/gleam-lang/tree-sitter-gleam"
rev = "58b7cac8fc14c92b0677c542610d8738c373fa81"
```

The `repository` field must specify a repository where the Tree-sitter grammar should be loaded from, and the `rev` field must contain a Git revision to use, such as the SHA of a Git commit. If you're developing an extension locally and want to load a grammar from the local filesystem, you can use a `file://` URL for `repository`. An extension can provide multiple grammars by referencing multiple Tree-sitter repositories.

## Tree-sitter Queries

Zed runs [Tree-sitter](https://tree-sitter.github.io) queries against the syntax tree produced by the parser to implement several features:

- Syntax highlighting
- Bracket matching
- Code outline/structure
- Auto-indentation
- Code injections
- Syntax overrides
- Text redactions
- Runnable code detection
- Selecting classes, functions, etc.

The following sections elaborate on how [Tree-sitter queries](https://tree-sitter.github.io/tree-sitter/using-parsers/queries/index.html) enable these features in Zed, using [JSON syntax](https://www.json.org/json-en.html) as a guiding example.

### Syntax highlighting

In Tree-sitter, the `highlights.scm` file defines syntax highlighting rules for a particular language.

Here's an example from a `highlights.scm` for JSON:

```scheme
(string) @string

(pair
  key: (string) @property.json_key)

(number) @number
```

This query marks strings, object keys, and numbers for highlighting. The following is the full list of captures supported by themes:

| Capture                  | Description                            |
| ------------------------ | -------------------------------------- |
| @attribute               | Captures attributes                    |
| @boolean                 | Captures boolean values                |
| @comment                 | Captures comments                      |
| @comment.doc             | Captures documentation comments        |
| @constant                | Captures constants                     |
| @constant.builtin        | Captures built-in constants            |
| @constructor             | Captures constructors                  |
| @embedded                | Captures embedded content              |
| @emphasis                | Captures emphasized text               |
| @emphasis.strong         | Captures strongly emphasized text      |
| @enum                    | Captures enumerations                  |
| @function                | Captures functions                     |
| @hint                    | Captures hints                         |
| @keyword                 | Captures keywords                      |
| @label                   | Captures labels                        |
| @link_text               | Captures link text                     |
| @link_uri                | Captures link URIs                     |
| @number                  | Captures numeric values                |
| @operator                | Captures operators                     |
| @predictive              | Captures predictive text               |
| @preproc                 | Captures preprocessor directives       |
| @primary                 | Captures primary elements              |
| @property                | Captures properties                    |
| @punctuation             | Captures punctuation                   |
| @punctuation.bracket     | Captures brackets                      |
| @punctuation.delimiter   | Captures delimiters                    |
| @punctuation.list_marker | Captures list markers                  |
| @punctuation.special     | Captures special punctuation           |
| @string                  | Captures string literals               |
| @string.escape           | Captures escaped characters in strings |
| @string.regex            | Captures regular expressions           |
| @string.special          | Captures special strings               |
| @string.special.symbol   | Captures special symbols               |
| @tag                     | Captures tags                          |
| @tag.doctype             | Captures doctypes (e.g., in HTML)      |
| @text.literal            | Captures literal text                  |
| @title                   | Captures titles                        |
| @type                    | Captures types                         |
| @type.builtin            | Captures built-in types                |
| @variable                | Captures variables                     |
| @variable.special        | Captures special variables             |
| @variable.parameter      | Captures function/method parameters    |
| @variant                 | Captures variants                      |

#### Fallback captures

A single Tree-sitter pattern can specify multiple captures on the same node to define fallback highlights.
Zed resolves them right-to-left: it first tries the rightmost capture, and if the current theme has no style for it, falls back to the next capture to the left, and so on.

For example:

```scheme
(type_identifier) @type @variable
```

Here Zed will first try to resolve `@variable` from the theme. If the theme defines a style for `@variable`, that style is used. Otherwise, Zed falls back to `@type`.

This is useful when a language wants to provide a preferred highlight that not all themes may support, while still falling back to a more common capture that most themes define.

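The right-to-left lookup can be sketched in a few lines of Rust. This is only an illustration, not Zed's implementation: `resolve_fallback` and the `theme` map (capture name to a style label) are hypothetical names introduced here.

```rust
use std::collections::HashMap;

/// Picks a style for a node that carries several captures, such as
/// `(type_identifier) @type @variable`, trying the rightmost capture first.
/// `theme` is a hypothetical map from capture name to a style label.
fn resolve_fallback<'a>(
    captures: &[&str],
    theme: &HashMap<&str, &'a str>,
) -> Option<&'a str> {
    captures
        .iter()
        .rev() // the rightmost capture is tried first
        .find_map(|name| theme.get(*name).copied())
}
```

With a theme that only styles `type`, calling `resolve_fallback(&["type", "variable"], &theme)` yields the `type` style. Once the theme also styles `variable`, the same call yields the `variable` style instead.
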
### Bracket matching

The `brackets.scm` file defines matching brackets.

Here's an example from a `brackets.scm` file for JSON:

```scheme
("[" @open "]" @close)
("{" @open "}" @close)
("\"" @open "\"" @close)
```

This query identifies opening and closing brackets, braces, and quotation marks.

| Capture | Description                                   |
| ------- | --------------------------------------------- |
| @open   | Captures opening brackets, braces, and quotes |
| @close  | Captures closing brackets, braces, and quotes |

Zed uses these captures to highlight matching brackets: it paints each bracket pair in a different color ("rainbow brackets") and highlights the pair when the cursor is inside it.

To opt a pair out of rainbow bracket colorization, add the following to the corresponding `brackets.scm` entry:

```scheme
(("\"" @open "\"" @close) (#set! rainbow.exclude))
```

### Code outline/structure

The `outline.scm` file defines the structure for the code outline.

Here's an example from an `outline.scm` file for JSON:

```scheme
(pair
  key: (string (string_content) @name)) @item
```

This query captures object keys for the outline structure.

| Capture        | Description                                                                          |
| -------------- | ------------------------------------------------------------------------------------ |
| @name          | Captures the content of object keys                                                  |
| @item          | Captures the entire key-value pair                                                   |
| @context       | Captures elements that provide context for the outline item                          |
| @context.extra | Captures additional contextual information for the outline item                      |
| @annotation    | Captures nodes annotating an outline item (doc comments, attributes, decorators)[^1] |

[^1]: These annotations are used by Assistant when generating code modification steps.

### Auto-indentation

The `indents.scm` file defines indentation rules.

Here's an example from an `indents.scm` file for JSON:

```scheme
(array "]" @end) @indent
(object "}" @end) @indent
```

This query marks arrays and objects as indented regions and their closing brackets and braces as the end of each region.

| Capture | Description                                        |
| ------- | -------------------------------------------------- |
| @end    | Captures closing brackets and braces               |
| @indent | Captures entire arrays and objects for indentation |

### Code injections

The `injections.scm` file defines rules for embedding one language within another, such as code blocks in Markdown or SQL queries in Python strings.

Here's an example from an `injections.scm` file for Markdown:

```scheme
(fenced_code_block
  (info_string
    (language) @injection.language)
  (code_fence_content) @injection.content)

((inline) @injection.content
 (#set! injection.language "markdown-inline"))
```

This query identifies fenced code blocks, capturing the language specified in the info string and the content within the block. It also captures inline content and sets its language to "markdown-inline".

| Capture             | Description                                                |
| ------------------- | ---------------------------------------------------------- |
| @injection.language | Captures the language identifier for a code block          |
| @injection.content  | Captures the content to be treated as a different language |

Note that we couldn't use JSON as an example here because it doesn't support language injections.

### Syntax overrides

The `overrides.scm` file defines syntactic _scopes_ that can be used to override certain editor settings within specific language constructs.

For example, there is a language-specific setting called `word_characters` that controls which non-alphabetic characters are considered part of a word, such as when you double-click to select a variable. In JavaScript, `$` and `#` are considered word characters.

There is also a language-specific setting called `completion_query_characters` that controls which characters trigger autocomplete suggestions. In JavaScript, when your cursor is within a _string_, `-` should be considered a completion query character. To achieve this, the JavaScript `overrides.scm` file contains the following pattern:

```scheme
[
  (string)
  (template_string)
] @string
```

And the JavaScript `config.toml` contains this setting:

```toml
word_characters = ["#", "$"]

[overrides.string]
completion_query_characters = ["-"]
```

You can also disable certain auto-closing brackets in a specific scope. For example, to prevent auto-closing `'` within strings, you could put the following in the JavaScript `config.toml`:

```toml
brackets = [
  { start = "'", end = "'", close = true, newline = false, not_in = ["string"] },
  # other pairs...
]
```

#### Range inclusivity

By default, the ranges defined in `overrides.scm` are _exclusive_. So in the case above, if your cursor was _outside_ the quotation marks delimiting the string, the `string` scope would not take effect. Sometimes, you may want to make the range _inclusive_. You can do this by adding the `.inclusive` suffix to the capture name in the query.

For example, in JavaScript, we also disable auto-closing of single quotes within comments, and the comment scope must extend all the way to the newline after a line comment. To achieve this, the JavaScript `overrides.scm` contains the following pattern:

```scheme
(comment) @comment.inclusive
```

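The difference between exclusive and inclusive ranges can be modeled with a small Rust sketch. It is an illustration under stated assumptions, not Zed's code: `scope_applies` is a hypothetical helper, and it assumes that "exclusive" means a cursor sitting exactly on either edge of the captured node is treated as outside the scope.

```rust
use std::ops::Range;

/// Returns whether a cursor at byte `offset` falls inside an override scope
/// whose captured node spans `node`.
/// Assumption: an exclusive scope rejects a cursor on either edge of the
/// node, while an `.inclusive` scope accepts both edges.
fn scope_applies(node: &Range<usize>, offset: usize, inclusive: bool) -> bool {
    if inclusive {
        node.start <= offset && offset <= node.end
    } else {
        node.start < offset && offset < node.end
    }
}
```

Under this model, a cursor placed at the very end of a line comment sits on the node's end edge, so the `comment` scope only applies there when the capture is marked `.inclusive`.
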
### Text objects

The `textobjects.scm` file defines rules for navigating by text objects. This was added in Zed v0.165 and is currently used only in Vim mode.

Vim provides two levels of granularity for navigating around files: section-by-section with `[]` etc., and method-by-method with `]m` etc. Even languages that don't support functions and classes can work well by defining similar concepts. For example, CSS defines a rule-set as a method, and a media-query as a class.

For languages with closures, these typically should not count as functions in Zed. This is best-effort, however, because languages like JavaScript do not syntactically differentiate between closures and top-level function declarations.

For languages with declarations like C, provide queries that match `@class.around` or `@function.around`. The `if` and `ic` text objects will default to these if there is no matching `.inside` capture.

If you are not sure what to put in `textobjects.scm`, both [nvim-treesitter-textobjects](https://github.com/nvim-treesitter/nvim-treesitter-textobjects) and the [Helix editor](https://github.com/helix-editor/helix) have queries for many languages. You can refer to the Zed [built-in languages](https://github.com/zed-industries/zed/tree/main/crates/languages/src) to see how to adapt these.

| Capture          | Description                                                             | Vim mode                                         |
| ---------------- | ----------------------------------------------------------------------- | ------------------------------------------------ |
| @function.around | An entire function definition or equivalent small section of a file.    | `[m`, `]m`, `[M`, `]M` motions. `af` text object |
| @function.inside | The function body (the stuff within the braces).                        | `if` text object                                 |
| @class.around    | An entire class definition or equivalent large section of a file.       | `[[`, `]]`, `[]`, `][` motions. `ac` text object |
| @class.inside    | The contents of a class definition.                                     | `ic` text object                                 |
| @comment.around  | An entire comment (e.g. all adjacent line comments, or a block comment) | `gc` text object                                 |
| @comment.inside  | The contents of a comment                                               | `igc` text object (rarely supported)             |

For example:

```scheme
; include only the content of the method in the function
(method_definition
    body: (_
        "{"
        (_)* @function.inside
        "}")) @function.around

; match function.around for declarations with no body
(function_signature_item) @function.around

; join all adjacent comments into one
(comment)+ @comment.around
```

### Text redactions

The `redactions.scm` file defines text redaction rules. When collaborating and sharing your screen, it makes sure that certain syntax nodes are rendered in a redacted mode so that they don't leak.

Here's an example from a `redactions.scm` file for JSON:

```scheme
(pair value: (number) @redact)
(pair value: (string) @redact)
(array (number) @redact)
(array (string) @redact)
```

This query marks number and string values in key-value pairs and arrays for redaction.

| Capture | Description                    |
| ------- | ------------------------------ |
| @redact | Captures values to be redacted |

### Runnable code detection

The `runnables.scm` file defines rules for detecting runnable code.

Here's an example from a `runnables.scm` file for JSON:

```scheme
(
    (document
        (object
            (pair
                key: (string
                    (string_content) @_name
                    (#eq? @_name "scripts")
                )
                value: (object
                    (pair
                        key: (string (string_content) @run @script)
                    )
                )
            )
        )
    )
    (#set! tag package-script)
    (#set! tag composer-script)
)
```

This query detects runnable scripts in `package.json` and `composer.json` files.

The `@run` capture specifies where the run button should appear in the editor. Other captures, except those prefixed with an underscore, are exposed as environment variables named `ZED_CUSTOM_$(capture_name)` when running the code.

| Capture | Description                                            |
| ------- | ------------------------------------------------------ |
| @\_name | Captures the "scripts" key                             |
| @run    | Captures the script name                               |
| @script | Also captures the script name (for different purposes) |

<!--
TBD: `#set! tag`
-->

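As a rough illustration of how the captures from a runnables query turn into environment variables, consider the Rust sketch below. It is not Zed's implementation: `runnable_env` is a hypothetical helper, and it assumes that the capture name is appended verbatim to `ZED_CUSTOM_` and that `@run` itself is not exported because it only positions the run button.

```rust
/// Builds environment variables from `(capture_name, captured_text)` pairs.
/// Assumptions: `run` only marks the run button location, captures starting
/// with `_` are private, and the capture name is appended to `ZED_CUSTOM_`
/// without any case conversion.
fn runnable_env(captures: &[(&str, &str)]) -> Vec<(String, String)> {
    captures
        .iter()
        .filter(|(name, _)| *name != "run" && !name.starts_with('_'))
        .map(|(name, text)| (format!("ZED_CUSTOM_{name}"), text.to_string()))
        .collect()
}
```

For the JSON query above, a script named `build` would produce a single `ZED_CUSTOM_script` variable under these assumptions, since `@_name` is private and `@run` is skipped.
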
## Language Servers

Zed uses the [Language Server Protocol](https://microsoft.github.io/language-server-protocol/) to provide advanced language support.

An extension may provide any number of language servers. To provide a language server from your extension, add an entry to your `extension.toml` with the name of your language server and the language(s) it applies to. Each entry in the `languages` list has to match the `name` field from the `config.toml` file for that language:

```toml
[language_servers.my-language-server]
name = "My Language LSP"
languages = ["My Language"]
```

Then, in the Rust code for your extension, implement the `language_server_command` method on your extension:

```rust
use zed_extension_api::{self as zed, LanguageServerId, Result};

impl zed::Extension for MyExtension {
    // Returns the command Zed uses to start the language server.
    fn language_server_command(
        &mut self,
        language_server_id: &LanguageServerId,
        worktree: &zed::Worktree,
    ) -> Result<zed::Command> {
        // The three `get_*` helpers are placeholders for your own logic.
        Ok(zed::Command {
            command: get_path_to_language_server_executable()?,
            args: get_args_for_language_server()?,
            env: get_env_for_language_server()?,
        })
    }
}
```

You can customize the handling of the language server using several optional methods in the `Extension` trait. For example, you can control how completions are styled using the `label_for_completion` method. For a complete list of methods, see the [API docs for the Zed extension API](https://docs.rs/zed_extension_api).

### Syntax Highlighting with Semantic Tokens

Zed supports syntax highlighting using semantic tokens from the attached language servers. This is currently disabled by default, but can be enabled in your settings file:

```json [settings]
{
  // Enable semantic tokens globally, backed by tree-sitter highlights for each language:
  "semantic_tokens": "combined",
  // Or, specify per-language:
  "languages": {
    "Rust": {
      // No tree-sitter, only LSP semantic tokens:
      "semantic_tokens": "full"
    }
  }
}
```

The `semantic_tokens` setting accepts the following values:

- `"off"` (default): Do not request semantic tokens from language servers.
- `"combined"`: Use LSP semantic tokens together with tree-sitter highlighting.
- `"full"`: Use LSP semantic tokens exclusively, replacing tree-sitter highlighting.

#### Extension-Provided Semantic Token Rules

Language extensions can ship default semantic token rules for their language server's custom token types. To do this, place a `semantic_token_rules.json` file in the language directory alongside `config.toml`:

```
my-extension/
  languages/
    my-language/
      config.toml
      highlights.scm
      semantic_token_rules.json
```

The file uses the same format as the `semantic_token_rules` array in user settings, that is, a JSON array of rule objects:

```json
[
  {
    "token_type": "lifetime",
    "style": ["lifetime"]
  },
  {
    "token_type": "builtinType",
    "style": ["type"]
  },
  {
    "token_type": "selfKeyword",
    "style": ["variable.special"]
  }
]
```

This is useful when a language server reports custom (non-standard) semantic token types that aren't covered by Zed's built-in default rules. Extension-provided rules act as sensible defaults for that language. Users can always override them via `semantic_token_rules` in their settings file, and built-in default rules are only used when neither user nor extension rules match.

#### Customizing Semantic Token Styles

Zed supports customizing the styles used for semantic tokens. You can define rules in your settings file, which customize how semantic tokens get mapped to styles in your theme.

```json [settings]
{
  "global_lsp_settings": {
    "semantic_token_rules": [
      {
        // Highlight macros as keywords.
        "token_type": "macro",
        "style": ["syntax.keyword"]
      },
      {
        // Highlight unresolved references in bold red.
        "token_type": "unresolvedReference",
        "foreground_color": "#c93f3f",
        "font_weight": "bold"
      },
      {
        // Underline all mutable variables/references/etc.
        "token_modifiers": ["mutable"],
        "underline": true
      }
    ]
  }
}
```

All rules that match a given `token_type` and `token_modifiers` are applied. Earlier rules take precedence. If no rules match, the token is not highlighted.

Rules are applied in the following priority order (highest to lowest):

1. **User settings**: rules from `semantic_token_rules` in your settings file.
2. **Extension rules**: rules from `semantic_token_rules.json` in extension language directories.
3. **Default rules**: Zed's built-in rules for standard LSP token types.

Each rule in the `semantic_token_rules` array is defined as follows:

- `token_type`: The semantic token type as defined by the [LSP specification](https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#textDocument_semanticTokens). If omitted, the rule matches all token types.
- `token_modifiers`: A list of semantic token modifiers to match. All modifiers must be present to match.
- `style`: A list of styles from the current syntax theme to use. The first style found is used. Any settings below override that style.
- `foreground_color`: The foreground color to use for the token type, in hex format (e.g., `"#ff0000"`).
- `background_color`: The background color to use for the token type, in hex format (e.g., `"#ff0000"`).
- `underline`: A boolean or color to underline with, in hex format. If `true`, then the token will be underlined with the text color.
- `strikethrough`: A boolean or color to strike through with, in hex format. If `true`, then the token will have a strikethrough in the text color.
- `font_weight`: One of `"normal"`, `"bold"`.
- `font_style`: One of `"normal"`, `"italic"`.

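The matching behavior described above can be sketched as follows. This is a simplified model rather than Zed's implementation: the `Rule` and `Resolved` types and the `resolve` function are hypothetical, only two of the style fields are modeled, and "earlier rules take precedence" is assumed to mean that a field set by an earlier rule is not overwritten by a later one.

```rust
/// A trimmed-down rule; only two of the style fields are modeled here.
#[derive(Default)]
struct Rule {
    token_type: Option<&'static str>,
    token_modifiers: Vec<&'static str>,
    foreground_color: Option<&'static str>,
    font_weight: Option<&'static str>,
}

/// The combined style produced by all matching rules.
#[derive(Debug, Default, PartialEq)]
struct Resolved {
    foreground_color: Option<&'static str>,
    font_weight: Option<&'static str>,
}

impl Rule {
    /// A rule matches when its `token_type` is omitted or equal to the
    /// token's type, and every listed modifier is present on the token.
    fn matches(&self, token_type: &str, modifiers: &[&str]) -> bool {
        self.token_type.map_or(true, |t| t == token_type)
            && self.token_modifiers.iter().all(|m| modifiers.contains(m))
    }
}

/// Applies every matching rule in order. A field set by an earlier rule is
/// kept (assumed meaning of precedence). Returns `None` if nothing matches.
fn resolve(rules: &[Rule], token_type: &str, modifiers: &[&str]) -> Option<Resolved> {
    let mut matched = false;
    let mut out = Resolved::default();
    for rule in rules.iter().filter(|r| r.matches(token_type, modifiers)) {
        matched = true;
        out.foreground_color = out.foreground_color.or(rule.foreground_color);
        out.font_weight = out.font_weight.or(rule.font_weight);
    }
    matched.then_some(out)
}
```

In this model, a `macro` token with the `mutable` modifier picks up the color from a `macro` rule listed first and the font weight from a later `mutable` rule, while a token that matches no rule resolves to `None` and stays unhighlighted.
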
### Multi-Language Support

If your language server supports additional languages, you can use `language_ids` to map Zed `languages` to the desired [LSP-specific `languageId`](https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#textDocumentItem) identifiers:

```toml
[language_servers.my-language-server]
name = "Whatever LSP"
languages = ["JavaScript", "TSX", "HTML", "CSS"]

[language_servers.my-language-server.language_ids]
"JavaScript" = "javascript"
"TSX" = "typescriptreact"
"HTML" = "html"
"CSS" = "css"
```