---
title: AI Improvement and Data Collection - Zed
description: Zed's opt-in approach to AI data collection for improving the agent panel and edit predictions.
---

# Zed AI Features and Privacy

## Overview

AI features in Zed include:

- [Agent Panel](./agent-panel.md)
- [Edit Predictions](./edit-prediction.md)
- [Inline Assist](./inline-assistant.md)
- [Text Threads](./text-threads.md)
- Auto Git Commit Message Generation

By default, Zed does not store your prompts or code context. This data is sent to your selected AI provider (e.g., Anthropic, OpenAI, Google, or xAI) to generate responses, then discarded. Zed will not use your data to evaluate or improve AI features unless you explicitly share it (see [AI Feedback with Ratings](#ai-feedback-with-ratings)) or you opt in to edit prediction training data collection (see [Edit Predictions](#edit-predictions)).

Zed is model-agnostic by design, and none of this changes based on which provider you choose. Whether you use your own API keys or Zed's hosted models, Zed does not retain your data.

### Data Retention and Training

You can use Zed's Agent Panel with:

- [Zed's hosted models](./subscription.md)
- [A non-Zed AI service connected via API key](./llm-providers.md)
- An [external agent](./external-agents.md) connected via the Agent Client Protocol (ACP)

For Zed's hosted models, we require assurances from our service providers that your user content won't be used for training models.

| Provider  | No Training Guarantee                                   | Zero-Data Retention (ZDR)                                                                                                                     |
| --------- | ------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------- |
| Anthropic | [Yes](https://www.anthropic.com/legal/commercial-terms) | [Yes](https://privacy.anthropic.com/en/articles/8956058-i-have-a-zero-data-retention-agreement-with-anthropic-what-products-does-it-apply-to) |
| Google    | [Yes](https://cloud.google.com/terms/service-terms)     | [Yes](https://cloud.google.com/terms/service-terms), see Service Terms sections 17 and 19h                                                    |
| OpenAI    | [Yes](https://openai.com/enterprise-privacy/)           | [Yes](https://platform.openai.com/docs/guides/your-data)                                                                                      |
| xAI       | [Yes](https://x.ai/legal/faq-enterprise)                | [Yes](https://x.ai/legal/faq-enterprise)                                                                                                      |

When you use your own API keys or external agents, **Zed does not have control over how your data is used by that service provider.** Refer to your agreement with each service provider to understand which terms and conditions apply.

### AI Feedback with Ratings

You can provide feedback on Zed's AI features by rating specific AI responses in Zed and sharing details related to those conversations with Zed. Each share is opt-in, and sharing once will not cause future content or data to be shared again.

> **Rating = Data Sharing:** When you rate a response, your entire conversation thread is sent to Zed. This includes messages, AI responses, and thread metadata.
> **_If you don't want data persisted on Zed's servers, don't rate_**. We will not collect data for improving our AI features without you explicitly rating responses.

### Data Collected (AI Feedback)

For conversations you have explicitly shared with us via rating, Zed may store:

- All messages in the thread (your prompts and AI responses)
- Any commentary you include with your rating
- Thread metadata (model used, token counts, timestamps)
- Metadata about your Zed installation

If you do not rate responses, Zed will not store Customer Data (code, conversations, responses) related to your usage of the AI features.

Separately, Zed collects telemetry related to its AI features. This includes metadata such as which AI feature is being used and high-level interactions with the feature to understand performance (e.g., agent response time, edit acceptance/rejection in the Agent Panel or edit predictions). You can read more in Zed's [telemetry](../telemetry.md) documentation.

Collected data is stored in Snowflake, a private database. We periodically review this data to refine the agent's system prompt and tool use. All data is anonymized and stripped of sensitive information (access tokens, user IDs, email addresses).

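
The telemetry mentioned above is configured separately from AI feedback ratings. As a minimal sketch, assuming the `telemetry.diagnostics` and `telemetry.metrics` keys described in Zed's [telemetry](../telemetry.md) documentation (they are not defined on this page, so check that page for the current names), turning it off in your settings file might look like this:

```json [settings]
{
  // Assumed keys, taken from the telemetry documentation rather than this page.
  "telemetry": {
    "diagnostics": false,
    "metrics": false
  }
}
```

This affects telemetry only. It does not change what is sent to your selected AI provider to generate responses.
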
## Edit Predictions

Edit predictions can be powered by **Zed's Zeta model** or by **third-party providers** like GitHub Copilot.

### Zed's Zeta Model (Default)

Zed sends a limited context window to the model to generate predictions:

- A code excerpt around your cursor (not the full file)
- Recent edits as diffs
- Relevant excerpts from related open files

This data is processed transiently to generate predictions and is not retained afterward.

### Third-Party Providers

When using third-party providers like GitHub Copilot, **Zed does not control how your data is handled** by that provider. You should consult their Terms and Conditions directly.

Note: Zed's `disabled_globs` setting prevents predictions from being requested for matching files, but third-party providers may still receive file content when files are opened.

### Training Data: Opt-In for Open Source Projects

Zed does not collect training data for our edit prediction model unless all of the following conditions are met:

1. **You opt in**: toggle "Training Data Collection" under the **Privacy** section of the edit prediction status bar menu (click the edit prediction icon in the status bar).
2. **The project is open source**: detected via a LICENSE file ([see detection logic](https://github.com/zed-industries/zed/blob/main/crates/edit_prediction/src/license_detection.rs)).
3. **The file isn't excluded**: it does not match any `disabled_globs` pattern.

### File Exclusions

Certain files are always excluded from edit predictions, regardless of opt-in status:

```json [settings]
{
  "edit_predictions": {
    "disabled_globs": [
      "**/.env*",
      "**/*.pem",
      "**/*.key",
      "**/*.cert",
      "**/*.crt",
      "**/secrets.yml"
    ]
  }
}
```

You can exclude additional paths and/or file extensions by adding them to [`edit_predictions.disabled_globs`](https://zed.dev/docs/reference/all-settings#edit-predictions) in your Zed settings file ([how to edit](../configuring-zed.md#settings-files)):

```json [settings]
{
  "edit_predictions": {
    "disabled_globs": ["secret_dir/*", "**/*.log"]
  }
}
```

### Data Collected (Edit Prediction Training Data)

For open source projects where you've opted in, Zed may collect:

- Code excerpt around your cursor
- Recent edit diffs
- The generated prediction
- Repository URL and git revision
- Buffer outline and diagnostics

Collected data is stored in Snowflake. We periodically review this data to select training samples for inclusion in our model training dataset. We ensure any included data is anonymized and contains no sensitive information (access tokens, user IDs, email addresses, etc.). This training dataset is publicly available at [huggingface.co/datasets/zed-industries/zeta](https://huggingface.co/datasets/zed-industries/zeta).

### Model Output

We then use this training dataset to fine-tune [Qwen2.5-Coder-7B](https://huggingface.co/Qwen/Qwen2.5-Coder-7B) and make the resulting model available at [huggingface.co/zed-industries/zeta](https://huggingface.co/zed-industries/zeta).

## Applicable Terms

Please see the [Zed Terms of Service](https://zed.dev/terms) for more information.