Add `completion_query_characters` in language (#27175)

Smit Barmase, Max Brunsfeld, and Ben Kunkle created

Closes #18581

The characters that make up a completion query are now separate from word
characters, which govern how words are selected by double clicking or
navigating. This fixes several things:

For settings.json, this improves completions by treating the whole string
as the completion query, instead of just the last word. "Space" is now a
completion query character without being a word character.
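
As a rough illustration of the settings.json case, here is a minimal sketch of extracting a query by scanning backward from the cursor. The function `completion_query` and its `extra` parameter are hypothetical stand-ins for this example and are not Zed's actual implementation; `extra` plays the role of `completion_query_characters`.

```rust
use std::collections::HashSet;

/// Returns the longest run of query characters that ends at the cursor.
/// `extra` is a hypothetical stand-in for `completion_query_characters`.
fn completion_query<'a>(text_before_cursor: &'a str, extra: &HashSet<char>) -> &'a str {
    let mut start = text_before_cursor.len();
    // Walk backward until a character that is not part of the query is found.
    for (idx, c) in text_before_cursor.char_indices().rev() {
        if c.is_alphanumeric() || c == '_' || extra.contains(&c) {
            start = idx;
        } else {
            break;
        }
    }
    &text_before_cursor[start..]
}

fn main() {
    // Text typed inside a JSON string, up to the cursor.
    let typed = "\"one d";

    let without_space: HashSet<char> = HashSet::new();
    let with_space: HashSet<char> = [' '].into_iter().collect();

    // Without space in the set, only the last word is the query.
    assert_eq!(completion_query(typed, &without_space), "d");
    // With space in the set, the whole string content is the query.
    assert_eq!(completion_query(typed, &with_space), "one d");
}
```

The opening quote ends the scan in both cases, since it is neither alphanumeric nor in the extra set.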

For keymap.json, this improves selecting part of an action, as the ":"
character is now only a completion query character and not a word
character. So completions still trigger on ":" and the query capture
treats ":" as a word character, but for actions like selection and
navigation, ":" is treated as punctuation.
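
The keymap.json case can be sketched in the same spirit. The helpers `is_word_char` and `word_at` below are hypothetical and only approximate what a double click selects; the character sets are assumptions chosen to mirror the description above.

```rust
use std::collections::HashSet;

/// Hypothetical helper: is `c` part of a word, given a language's extra characters?
fn is_word_char(c: char, extra: &HashSet<char>) -> bool {
    c.is_alphanumeric() || c == '_' || extra.contains(&c)
}

/// Returns the run of word characters surrounding byte offset `offset`.
/// `offset` must lie on a character boundary.
fn word_at<'a>(text: &'a str, offset: usize, extra: &HashSet<char>) -> &'a str {
    // Extend left while the preceding characters are word characters.
    let start = text[..offset]
        .char_indices()
        .rev()
        .take_while(|(_, c)| is_word_char(*c, extra))
        .last()
        .map_or(offset, |(idx, _)| idx);
    // Extend right until the first non-word character.
    let end = text[offset..]
        .char_indices()
        .find(|(_, c)| !is_word_char(*c, extra))
        .map_or(text.len(), |(idx, _)| offset + idx);
    &text[start..end]
}

fn main() {
    let action = "editor::ToggleFold";
    let click = 10; // a position inside "ToggleFold"

    // Selection uses word characters, which no longer include ':'.
    let word_characters: HashSet<char> = HashSet::new();
    // Completion queries use their own set, which does include ':'.
    let completion_query_characters: HashSet<char> = [':'].into_iter().collect();

    assert_eq!(word_at(action, click, &word_characters), "ToggleFold");
    assert_eq!(
        word_at(action, click, &completion_query_characters),
        "editor::ToggleFold"
    );
}
```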

Before:
Irrelevant suggestions are shown because the query is only the last word,
which is "d".
<img width="300" alt="image"
src="https://github.com/user-attachments/assets/8199a715-7521-49dd-948b-e6aaed04c488"
/>

Double clicking `ToggleFold` selects the whole action:
<img width="300" alt="image"
src="https://github.com/user-attachments/assets/c7f91a6b-06d5-45b6-9d59-61a1b2deda71"
/>

After:
Now the query is "one d" and only matching suggestions are shown.
<img width="300" alt="image"
src="https://github.com/user-attachments/assets/1455dfbc-9906-42e8-b8aa-b3f551194ca2"
/>

Double clicking `ToggleFold` selects only that part of the action, which
is more precise behavior.
<img width="300" alt="image"
src="https://github.com/user-attachments/assets/34b1c3c2-184f-402f-9dc8-73030a8c370f"
/>

Release Notes:

- Improved autocomplete suggestions in `settings.json`: the whole string
is now used as the query instead of just its last word, which filters out
a lot of false positives.
- Improved selection of actions in `keymap.json`: you can now double
click to select only one part of an action instead of the whole
action.

---------

Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
Co-authored-by: Ben Kunkle <ben@zed.dev>

Change summary

crates/editor/src/editor_tests.rs                 |  2 
crates/editor/src/test/editor_lsp_test_context.rs |  2 
crates/language/src/buffer.rs                     | 18 ++++++++++------
crates/language/src/language.rs                   | 16 +++++++++++++++
crates/languages/src/css/config.toml              |  2 
crates/languages/src/javascript/config.toml       |  2 
crates/languages/src/json/config.toml             |  2 
crates/languages/src/jsonc/config.toml            |  2 
crates/languages/src/markdown/config.toml         |  2 
crates/languages/src/tsx/config.toml              |  2 
docs/src/extensions/languages.md                  |  6 +++-
extensions/html/languages/html/config.toml        |  2 
12 files changed, 40 insertions(+), 18 deletions(-)

Detailed changes

crates/editor/src/editor_tests.rs

@@ -12871,7 +12871,7 @@ async fn test_completions_in_languages_with_extra_word_characters(cx: &mut TestA
                 overrides: [(
                     "element".into(),
                     LanguageConfigOverride {
-                        word_characters: Override::Set(['-'].into_iter().collect()),
+                        completion_query_characters: Override::Set(['-'].into_iter().collect()),
                         ..Default::default()
                     },
                 )]

crates/editor/src/test/editor_lsp_test_context.rs

@@ -264,7 +264,7 @@ impl EditorLspTestContext {
                     ..Default::default()
                 },
                 block_comment: Some(("<!-- ".into(), " -->".into())),
-                word_characters: ['-'].into_iter().collect(),
+                completion_query_characters: ['-'].into_iter().collect(),
                 ..Default::default()
             },
             Some(tree_sitter_html::LANGUAGE.into()),

crates/language/src/buffer.rs

@@ -4727,23 +4727,27 @@ impl CharClassifier {
     }
 
     pub fn kind_with(&self, c: char, ignore_punctuation: bool) -> CharKind {
-        if c.is_whitespace() {
-            return CharKind::Whitespace;
-        } else if c.is_alphanumeric() || c == '_' {
+        if c.is_alphanumeric() || c == '_' {
             return CharKind::Word;
         }
 
         if let Some(scope) = &self.scope {
-            if let Some(characters) = scope.word_characters() {
+            let characters = if self.for_completion {
+                scope.completion_query_characters()
+            } else {
+                scope.word_characters()
+            };
+            if let Some(characters) = characters {
                 if characters.contains(&c) {
-                    if c == '-' && !self.for_completion && !ignore_punctuation {
-                        return CharKind::Punctuation;
-                    }
                     return CharKind::Word;
                 }
             }
         }
 
+        if c.is_whitespace() {
+            return CharKind::Whitespace;
+        }
+
         if ignore_punctuation {
             CharKind::Word
         } else {
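
One reading of the reordering in this hunk is that the whitespace check has to come after the language-specific lookup, otherwise a space listed in `completion_query_characters` could never be classified as a word character. The sketch below is a simplified, hypothetical model of that ordering; it omits `ignore_punctuation` and the scope handling, and `Kind`, `kind_whitespace_first`, and `kind_whitespace_last` are names invented for this example.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum Kind {
    Whitespace,
    Punctuation,
    Word,
}

/// Models the previous order: whitespace is decided before the extra set is consulted.
fn kind_whitespace_first(c: char, extra: &HashSet<char>) -> Kind {
    if c.is_whitespace() {
        Kind::Whitespace
    } else if c.is_alphanumeric() || c == '_' || extra.contains(&c) {
        Kind::Word
    } else {
        Kind::Punctuation
    }
}

/// Models the new order: the extra set is consulted first, whitespace afterwards.
fn kind_whitespace_last(c: char, extra: &HashSet<char>) -> Kind {
    if c.is_alphanumeric() || c == '_' || extra.contains(&c) {
        Kind::Word
    } else if c.is_whitespace() {
        Kind::Whitespace
    } else {
        Kind::Punctuation
    }
}

fn main() {
    // Assumed query character set containing a space, as described for settings.json.
    let query_chars: HashSet<char> = [' '].into_iter().collect();

    // With the old order, a space can never become part of a query.
    assert_eq!(kind_whitespace_first(' ', &query_chars), Kind::Whitespace);
    // With the new order, a space in the set counts as a word character.
    assert_eq!(kind_whitespace_last(' ', &query_chars), Kind::Word);
    // Whitespace that is not in the set is still whitespace.
    assert_eq!(kind_whitespace_last('\t', &query_chars), Kind::Whitespace);
    assert_eq!(kind_whitespace_last(':', &query_chars), Kind::Punctuation);
}
```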

crates/language/src/language.rs

@@ -700,6 +700,9 @@ pub struct LanguageConfig {
     /// If configured, this language contains JSX style tags, and should support auto-closing of those tags.
     #[serde(default)]
     pub jsx_tag_auto_close: Option<JsxTagAutoCloseConfig>,
+    /// A list of characters that Zed should treat as word characters for completion queries.
+    #[serde(default)]
+    pub completion_query_characters: HashSet<char>,
 }
 
 #[derive(Clone, Debug, Serialize, Deserialize, Default, JsonSchema)]
@@ -765,6 +768,8 @@ pub struct LanguageConfigOverride {
     #[serde(default)]
     pub word_characters: Override<HashSet<char>>,
     #[serde(default)]
+    pub completion_query_characters: Override<HashSet<char>>,
+    #[serde(default)]
     pub opt_into_language_servers: Vec<LanguageServerName>,
 }
 
@@ -816,6 +821,7 @@ impl Default for LanguageConfig {
             prettier_parser_name: None,
             hidden: false,
             jsx_tag_auto_close: None,
+            completion_query_characters: Default::default(),
         }
     }
 }
@@ -1705,6 +1711,16 @@ impl LanguageScope {
         )
     }
 
+    /// Returns a list of language-specific characters that are considered part of
+    /// a completion query.
+    pub fn completion_query_characters(&self) -> Option<&HashSet<char>> {
+        Override::as_option(
+            self.config_override()
+                .map(|o| &o.completion_query_characters),
+            Some(&self.language.config.completion_query_characters),
+        )
+    }
+
     /// Returns a list of bracket pairs for a given language with an additional
     /// piece of information about whether the particular bracket pair is currently active for a given language.
     pub fn brackets(&self) -> impl Iterator<Item = (&BracketPair, bool)> {

crates/languages/src/css/config.toml

@@ -9,6 +9,6 @@ brackets = [
     { start = "\"", end = "\"", close = true, newline = false, not_in = ["string", "comment"] },
     { start = "'", end = "'", close = true, newline = false, not_in = ["string", "comment"] },
 ]
-word_characters = ["-"]
+completion_query_characters = ["-"]
 block_comment = ["/* ", " */"]
 prettier_parser_name = "css"

crates/languages/src/javascript/config.toml

@@ -32,5 +32,5 @@ block_comment = ["{/* ", " */}"]
 opt_into_language_servers = ["emmet-language-server"]
 
 [overrides.string]
-word_characters = ["-"]
+completion_query_characters = ["-"]
 opt_into_language_servers = ["tailwindcss-language-server"]

crates/languages/src/markdown/config.toml

@@ -1,7 +1,7 @@
 name = "Markdown"
 grammar = "markdown"
 path_suffixes = ["md", "mdx", "mdwn", "markdown", "MD"]
-word_characters = ["-"]
+completion_query_characters = ["-"]
 block_comment = ["<!-- ", " -->"]
 autoclose_before = "}])>"
 brackets = [

crates/languages/src/tsx/config.toml

@@ -30,5 +30,5 @@ block_comment = ["{/* ", " */}"]
 opt_into_language_servers = ["emmet-language-server"]
 
 [overrides.string]
-word_characters = ["-"]
+completion_query_characters = ["-"]
 opt_into_language_servers = ["tailwindcss-language-server"]

docs/src/extensions/languages.md

@@ -223,7 +223,9 @@ Note that we couldn't use JSON as an example here because it doesn't support lan
 
 The `overrides.scm` file defines syntactic _scopes_ that can be used to override certain editor settings within specific language constructs.
 
-For example, there is a language-specific setting called `word_characters` that controls which non-alphabetic characters are considered part of a word, for filtering autocomplete suggestions. In JavaScript, "$" and "#" are considered word characters. But when your cursor is within a _string_ in JavaScript, "-" is _also_ considered a word character. To achieve this, the JavaScript `overrides.scm` file contains the following pattern:
+For example, there is a language-specific setting called `word_characters` that controls which non-alphabetic characters are considered part of a word, for example when you double click to select a variable. In JavaScript, "$" and "#" are considered word characters.
+
+There is also a language-specific setting called `completion_query_characters` that controls which characters trigger autocomplete suggestions. In JavaScript, when your cursor is within a _string_, "-" should be considered a completion query character. To achieve this, the JavaScript `overrides.scm` file contains the following pattern:
 
 ```scheme
 [
@@ -238,7 +240,7 @@ And the JavaScript `config.toml` contains this setting:
 word_characters = ["#", "$"]
 
 [overrides.string]
-word_characters = ["-"]
+completion_query_characters = ["-"]
 ```
 
 You can also disable certain auto-closing brackets in a specific scope. For example, to prevent auto-closing `'` within strings, you could put the following in the JavaScript `config.toml`:

extensions/html/languages/html/config.toml

@@ -11,5 +11,5 @@ brackets = [
     { start = "<", end = ">", close = false, newline = true, not_in = ["comment", "string"] },
     { start = "!--", end = " --", close = true, newline = false, not_in = ["comment", "string"] },
 ]
-word_characters = ["-"]
+completion_query_characters = ["-"]
 prettier_parser_name = "html"