# Language Extensions

Language support in Zed has several components:

- Language metadata and configuration
- Grammar
- Queries
- Language servers

## Language Metadata

Each language supported by Zed must be defined in a subdirectory inside the `languages` directory of your extension.

This subdirectory must contain a file called `config.toml` with the following structure:

```toml
name = "My Language"
grammar = "my-language"
path_suffixes = ["myl"]
line_comments = ["# "]
```

- `name` is the human-readable name that will show up in the Select Language dropdown.
- `grammar` is the name of a grammar. Grammars are registered separately, as described below.
- `path_suffixes` (optional) is an array of file suffixes that should be associated with this language. Unlike `file_types` in settings, this does not support glob patterns.
- `line_comments` (optional) is an array of strings that are used to identify line comments in the language.

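To make the role of `path_suffixes` concrete, here is a minimal Rust sketch of plain suffix matching for the simplest case, a bare suffix such as `myl`. It is not Zed's implementation: the function name `matches_path_suffix` is hypothetical, and the rule that a suffix matches either the file extension or the whole file name is an assumption made for illustration.

```rust
use std::path::Path;

/// Returns true when `path` matches one of the configured suffixes.
/// Assumption (not taken from Zed's source): a suffix matches either the
/// file extension or the complete file name.
fn matches_path_suffix(path: &Path, path_suffixes: &[&str]) -> bool {
    let extension = path.extension().and_then(|ext| ext.to_str());
    let file_name = path.file_name().and_then(|name| name.to_str());
    path_suffixes
        .iter()
        .any(|suffix| extension == Some(*suffix) || file_name == Some(*suffix))
}
```

Under these assumptions, `matches_path_suffix(Path::new("src/main.myl"), &["myl"])` returns `true`, while a path ending in `.rs` does not match.
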
<!--
TBD: Document `language_name/config.toml` keys

- line_comments, block_comment
- autoclose_before
- brackets (start, end, close, newline, not_in: ["comment", "string"])
- tab_size, hard_tabs
- word_characters
- prettier_parser_name
- opt_into_language_servers
- first_line_pattern
- code_fence_block_name
- scope_opt_in_language_servers
- increase_indent_pattern, decrease_indent_pattern
- collapsed_placeholder
-->

## Grammar

Zed uses the [Tree-sitter](https://tree-sitter.github.io) parsing library to provide built-in language-specific features. There are grammars available for many languages, and you can also [develop your own grammar](https://tree-sitter.github.io/tree-sitter/creating-parsers#writing-the-grammar). A growing list of Zed features is built using pattern matching over syntax trees with Tree-sitter queries. As mentioned above, every language that is defined in an extension must specify the name of a Tree-sitter grammar that is used for parsing. These grammars are then registered separately in the extension's `extension.toml` file, like this:

```toml
[grammars.gleam]
repository = "https://github.com/gleam-lang/tree-sitter-gleam"
commit = "58b7cac8fc14c92b0677c542610d8738c373fa81"
```

The `repository` field must specify a repository where the Tree-sitter grammar should be loaded from, and the `commit` field must contain the SHA of the Git commit to use. An extension can provide multiple grammars by referencing multiple Tree-sitter repositories.

## Tree-sitter Queries

Zed runs [Tree-sitter](https://tree-sitter.github.io) queries against the syntax tree produced by the grammar to implement several features:

- Syntax highlighting
- Bracket matching
- Code outline/structure
- Auto-indentation
- Code injections
- Syntax overrides
- Text redactions
- Runnable code detection

The following sections elaborate on how [Tree-sitter queries](https://tree-sitter.github.io/tree-sitter/using-parsers#query-syntax) enable these features in Zed, using [JSON syntax](https://www.json.org/json-en.html) as a guiding example.

### Syntax highlighting

In Tree-sitter, the `highlights.scm` file defines syntax highlighting rules for a particular language.

Here's an example from a `highlights.scm` file for JSON:

```scheme
(string) @string

(pair
  key: (string) @property.json_key)

(number) @number
```

This query marks strings, object keys, and numbers for highlighting. The following is a comprehensive list of captures supported by themes:

| Capture                  | Description                            |
| ------------------------ | -------------------------------------- |
| @attribute               | Captures attributes                    |
| @boolean                 | Captures boolean values                |
| @comment                 | Captures comments                      |
| @comment.doc             | Captures documentation comments        |
| @constant                | Captures constants                     |
| @constructor             | Captures constructors                  |
| @embedded                | Captures embedded content              |
| @emphasis                | Captures emphasized text               |
| @emphasis.strong         | Captures strongly emphasized text      |
| @enum                    | Captures enumerations                  |
| @function                | Captures functions                     |
| @hint                    | Captures hints                         |
| @keyword                 | Captures keywords                      |
| @label                   | Captures labels                        |
| @link_text               | Captures link text                     |
| @link_uri                | Captures link URIs                     |
| @number                  | Captures numeric values                |
| @operator                | Captures operators                     |
| @predictive              | Captures predictive text               |
| @preproc                 | Captures preprocessor directives       |
| @primary                 | Captures primary elements              |
| @property                | Captures properties                    |
| @punctuation             | Captures punctuation                   |
| @punctuation.bracket     | Captures brackets                      |
| @punctuation.delimiter   | Captures delimiters                    |
| @punctuation.list_marker | Captures list markers                  |
| @punctuation.special     | Captures special punctuation           |
| @string                  | Captures string literals               |
| @string.escape           | Captures escaped characters in strings |
| @string.regex            | Captures regular expressions           |
| @string.special          | Captures special strings               |
| @string.special.symbol   | Captures special symbols               |
| @tag                     | Captures tags                          |
| @text.literal            | Captures literal text                  |
| @title                   | Captures titles                        |
| @type                    | Captures types                         |
| @variable                | Captures variables                     |
| @variable.special        | Captures special variables             |
| @variant                 | Captures variants                      |

### Bracket matching

The `brackets.scm` file defines matching brackets.

Here's an example from a `brackets.scm` file for JSON:

```scheme
("[" @open "]" @close)
("{" @open "}" @close)
("\"" @open "\"" @close)
```

This query identifies opening and closing brackets, braces, and quotation marks.

| Capture | Description                                   |
| ------- | --------------------------------------------- |
| @open   | Captures opening brackets, braces, and quotes |
| @close  | Captures closing brackets, braces, and quotes |

### Code outline/structure

The `outline.scm` file defines the structure for the code outline.

Here's an example from an `outline.scm` file for JSON:

```scheme
(pair
  key: (string (string_content) @name)) @item
```

This query captures object keys for the outline structure.

| Capture        | Description                                                                             |
| -------------- | --------------------------------------------------------------------------------------- |
| @name          | Captures the content of object keys                                                     |
| @item          | Captures the entire key-value pair                                                      |
| @context       | Captures elements that provide context for the outline item                             |
| @context.extra | Captures additional contextual information for the outline item                         |
| @annotation    | Captures nodes that annotate an outline item (doc comments, attributes, decorators)[^1] |

[^1]: These annotations are used by Assistant when generating code modification steps.

### Auto-indentation

The `indents.scm` file defines indentation rules.

Here's an example from an `indents.scm` file for JSON:

```scheme
(array "]" @end) @indent
(object "}" @end) @indent
```

This query marks the end of arrays and objects for indentation purposes.

| Capture | Description                                        |
| ------- | -------------------------------------------------- |
| @end    | Captures closing brackets and braces               |
| @indent | Captures entire arrays and objects for indentation |

### Code injections

The `injections.scm` file defines rules for embedding one language within another, such as code blocks in Markdown or SQL queries in Python strings.

Here's an example from an `injections.scm` file for Markdown:

```scheme
(fenced_code_block
  (info_string
    (language) @language)
  (code_fence_content) @content)

((inline) @content
 (#set! "language" "markdown-inline"))
```

This query identifies fenced code blocks, capturing the language specified in the info string and the content within the block. It also captures inline content and sets its language to "markdown-inline".

| Capture   | Description                                                |
| --------- | ---------------------------------------------------------- |
| @language | Captures the language identifier for a code block          |
| @content  | Captures the content to be treated as a different language |

Note that we couldn't use JSON as an example here because it doesn't support language injections.

### Syntax overrides

The `overrides.scm` file defines syntax overrides.

Here's an example from an `overrides.scm` file for JSON:

```scheme
(string) @string
```

This query defines a `string` override scope that covers string nodes. Language configuration can then refer to such a scope by name, for example in the `not_in` list of a bracket pair, to change editor behavior inside the matching nodes.

### Text redactions

The `redactions.scm` file defines text redaction rules. When you are collaborating and sharing your screen, it makes sure that certain syntax nodes are rendered in a redacted mode to prevent them from leaking.

Here's an example from a `redactions.scm` file for JSON:

```scheme
(pair value: (number) @redact)
(pair value: (string) @redact)
(array (number) @redact)
(array (string) @redact)
```

This query marks number and string values in key-value pairs and arrays for redaction.

| Capture | Description                    |
| ------- | ------------------------------ |
| @redact | Captures values to be redacted |

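As a rough illustration of what rendering in a redacted mode could mean, the following Rust sketch masks the characters that fall inside captured byte ranges. How Zed actually draws redacted nodes is not described here, so the function name `redact`, the bullet character, and the use of byte ranges are all assumptions made for the example.

```rust
use std::ops::Range;

/// Replaces every character that starts inside one of `ranges` with a bullet.
/// Assumption: `ranges` are byte ranges of the nodes captured as `@redact`.
fn redact(text: &str, ranges: &[Range<usize>]) -> String {
    text.char_indices()
        .map(|(offset, ch)| {
            if ranges.iter().any(|range| range.contains(&offset)) {
                '•'
            } else {
                ch
            }
        })
        .collect()
}
```

For the JSON text `{"port": 8080}`, the number node spans bytes 9 to 13, so `redact("{\"port\": 8080}", &[9..13])` yields `{"port": ••••}`.
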
### Runnable code detection

The `runnables.scm` file defines rules for detecting runnable code.

Here's an example from a `runnables.scm` file for JSON:

```scheme
(
    (document
        (object
            (pair
                key: (string
                    (string_content) @_name
                    (#eq? @_name "scripts")
                )
                value: (object
                    (pair
                        key: (string (string_content) @run @script)
                    )
                )
            )
        )
    )
    (#set! tag package-script)
    (#set! tag composer-script)
)
```

This query detects runnable scripts in `package.json` and `composer.json` files.

The `@run` capture specifies where the run button should appear in the editor. Other captures, except those prefixed with an underscore, are exposed as environment variables named `ZED_CUSTOM_$(capture_name)` when running the code.

| Capture | Description                                            |
| ------- | ------------------------------------------------------ |
| @\_name | Captures the "scripts" key                             |
| @run    | Captures the script name                               |
| @script | Also captures the script name (for different purposes) |

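The capture-to-variable rule described above can be sketched in Rust as follows. This is an illustration of the naming rule only, not Zed's code: the function `runnable_env_vars` is hypothetical, and it assumes that capture names are appended to the `ZED_CUSTOM_` prefix verbatim and that `@run` itself is not exported.

```rust
/// Maps (capture name, captured text) pairs to environment variables.
/// Assumptions: `@run` only positions the run button, captures starting with
/// an underscore are private, and names are used verbatim after the prefix.
fn runnable_env_vars(captures: &[(&str, &str)]) -> Vec<(String, String)> {
    captures
        .iter()
        .filter(|(name, _)| *name != "run" && !name.starts_with('_'))
        .map(|(name, text)| (format!("ZED_CUSTOM_{name}"), text.to_string()))
        .collect()
}
```

With the captures from the query above and a script called `build`, only `@script` survives the filter, giving a single variable `ZED_CUSTOM_script` with the value `build`.
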
TBD: `#set! tag`

## Language Servers

Zed uses the [Language Server Protocol](https://microsoft.github.io/language-server-protocol/) to provide advanced language support.

An extension may provide any number of language servers. To provide a language server from your extension, add an entry to your `extension.toml` with the name of your language server and the language it applies to:

```toml
[language_servers.my-language]
name = "My Language LSP"
language = "My Language"
```

Then, in the Rust code for your extension, implement the `language_server_command` method on your extension:

```rust
impl zed::Extension for MyExtension {
    fn language_server_command(
        &mut self,
        language_server_id: &LanguageServerId,
        worktree: &zed::Worktree,
    ) -> Result<zed::Command> {
        // The three `get_*` helpers are placeholders for your own logic.
        Ok(zed::Command {
            command: get_path_to_language_server_executable()?,
            args: get_args_for_language_server()?,
            env: get_env_for_language_server()?,
        })
    }
}
```

You can customize the handling of the language server using several optional methods in the `Extension` trait. For example, you can control how completions are styled using the `label_for_completion` method. For a complete list of methods, see the [API docs for the Zed extension API](https://docs.rs/zed_extension_api).
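
The `language_server_command` method shown earlier relies on placeholder helpers. As a hedged sketch of what they might boil down to, the following self-contained Rust code builds a command value from an already-resolved binary path. The `Command` struct is a local stand-in that is assumed to mirror the `command`, `args`, and `env` fields of `zed::Command`; the function name and the `--stdio` flag are hypothetical, and a real extension would obtain the binary path through the extension API rather than as a parameter.

```rust
/// Local stand-in for `zed::Command`, assumed to carry the same three fields.
#[derive(Debug, PartialEq)]
struct Command {
    command: String,
    args: Vec<String>,
    env: Vec<(String, String)>,
}

/// Builds the launch command from an already-resolved binary path.
/// `binary_path` is `None` when the (hypothetical) server binary was not found.
fn build_language_server_command(binary_path: Option<String>) -> Result<Command, String> {
    let command = binary_path
        .ok_or_else(|| "language server binary not found".to_string())?;
    Ok(Command {
        command,
        // Hypothetical flag: check which arguments your server expects.
        args: vec!["--stdio".to_string()],
        // No extra environment variables in this sketch.
        env: Vec::new(),
    })
}
```

Returning an error when the binary is missing, instead of guessing a path, means a failed lookup is reported as an error rather than as an attempt to launch a nonexistent command.
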