Add prompts and scripts for automatic crash repro and fix (#49063)

Eric Holk and Ben Brandt created 2 months ago

These prompts can be used to automatically diagnose and fix crashes
reported in Sentry.

Usage:
1. Find a crash in Sentry. It will have an ID like ZED-123.
2. In an agent, do a prompt like `Follow the instructions in
@investigate.md to investigate ZED-123`
3. Once the agent finds a repro, fix it in a new thread by saying
`Follow the instructions in @fix.md`

Release Notes:
- N/A

---------

Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>

Change summary

.factory/prompts/crash/fix.md         |  84 ++++++
.factory/prompts/crash/investigate.md |  89 +++++++
script/sentry-fetch                   | 357 +++++++++++++++++++++++++++++
3 files changed, 530 insertions(+)

Detailed changes

.factory/prompts/crash/fix.md

@@ -0,0 +1,84 @@
+# Crash Fix
+
+You are fixing a crash that has been analyzed and has a reproduction test case. Your goal is to implement a minimal, correct fix that resolves the root cause and makes the reproduction test pass.
+
+## Inputs
+
+Before starting, you should have:
+
+1. **ANALYSIS.md** — the crash analysis from the investigation phase. Read it thoroughly.
+2. **A failing test** — a reproduction test that triggers the crash. Run it first to confirm it fails as expected.
+
+If either is missing, ask the user to provide them or run the investigation phase first (`/prompt crash/investigate`).
+
+## Workflow
+
+### Step 1: Confirm the Failing Test
+
+Run the reproduction test and verify it fails with the expected crash:
+
+```
+cargo test -p <crate> <test_name>
+```
+
+Read the failure output. Confirm the panic message and stack trace match what ANALYSIS.md describes. If the test doesn't fail, or fails differently than expected, stop and reassess before proceeding.
+
+### Step 2: Understand the Fix
+
+Read the "Suggested Fix" section of ANALYSIS.md and the relevant source code. Before writing any code, be clear on:
+
+1. **What invariant is being violated** — what property of the data does the crashing code assume?
+2. **Where the invariant breaks** — which function produces the bad state?
+
+### Step 3: Implement the Fix
+
+Apply the minimal change needed to resolve the root cause. Guidelines:
+
+- **Fix the root cause, not the symptom.** Don't just catch the panic with a bounds check if the real problem is an incorrect offset calculation. Fix the calculation.
+- **Preserve existing behavior** for all non-crashing cases. The fix should only change what happens in the scenario that was previously crashing.
+- **Don't add unnecessary changes.** No drive-by improvements; keep the diff focused.
+- **Add a comment only if the fix is non-obvious.** If a reader might wonder "why is this check here?", a brief comment explaining the crash scenario is appropriate.
+- **Consider long-term maintainability.** Make a targeted fix while keeping the long-term maintainability and reliability of the codebase in mind.
+
+### Step 4: Verify the Fix
+
+Run the reproduction test and confirm it passes:
+
+```
+cargo test -p <crate> <test_name>
+```
+
+Then run the full test suite for the affected crate to check for regressions:
+
+```
+cargo test -p <crate>
+```
+
+If any tests fail, determine whether the fix introduced a regression. Fix regressions before proceeding.
+
+### Step 5: Run Clippy
+
+```
+./script/clippy
+```
+
+Address any new warnings introduced by your change.
+
+### Step 6: Summarize
+
+Write a brief summary of the fix for use in a PR description. Include:
+
+- **What was the bug** — one sentence on the root cause.
+- **What the fix does** — one sentence on the change.
+- **How it was verified** — note that the reproduction test now passes.
+- **Sentry issue link** — if available from ANALYSIS.md.
+
+We use the following template for pull request descriptions. Fill in the relevant sections, especially the release notes.
+
+```
+<Description of change, what the issue was and the fix.>
+
+Release Notes:
+
+- N/A *or* Added/Fixed/Improved ...
+```
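
The verification order in Steps 4 and 5 of `fix.md` (reproduction test, then the full crate suite, then clippy) can be sketched as a small helper that only builds the command lines, without running them. This is an illustrative sketch rather than part of the change: the function names, the crate name `editor`, and the test name `test_crash_repro` are hypothetical placeholders.

```python
import shlex


def verification_commands(crate, test_name):
    """Return the commands from Steps 4 and 5 in the order the prompt lists them.

    `crate` and `test_name` are supplied by the caller; nothing here is
    taken from a real crash.
    """
    return [
        ["cargo", "test", "-p", crate, test_name],  # reproduction test
        ["cargo", "test", "-p", crate],  # full suite for the affected crate
        ["./script/clippy"],  # lint check
    ]


def render_plan(crate, test_name):
    """Render the commands as shell-quoted lines, one per step."""
    commands = verification_commands(crate, test_name)
    return "\n".join(shlex.join(command) for command in commands)


if __name__ == "__main__":
    # Hypothetical crate and test names, for illustration only.
    print(render_plan("editor", "test_crash_repro"))
```

Keeping the commands as argument lists rather than strings means they could be handed to `subprocess.run` unchanged if the steps were ever automated.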

.factory/prompts/crash/investigate.md

@@ -0,0 +1,89 @@
+# Crash Investigation
+
+You are investigating a crash that was observed in the wild. Your goal is to understand the root cause and produce a minimal reproduction test case that triggers the same crash. This test will be used to verify a fix and prevent regressions.
+
+## Workflow
+
+### Step 1: Get the Crash Report
+
+If given a Sentry issue ID (like `ZED-4VS` or a numeric ID), there are several ways to fetch the crash data:
+
+**Option A: Sentry MCP server (preferred if available)**
+If the Sentry MCP server is configured as a context server, use its tools directly (e.g., `get_sentry_issue`) to fetch the issue details and stack trace. This is the simplest path — no tokens or scripts needed.
+
+**Option B: Fetch script**
+Run the fetch script from the terminal:
+
+```
+script/sentry-fetch <issue-id>
+```
+
+This reads authentication from the `SENTRY_AUTH_TOKEN` environment variable, falling back to `~/.sentryclirc` (set up via `sentry-cli login`).
+
+**Option C: Crash report provided directly**
+If the crash report was provided inline or as a file, read it carefully before proceeding.
+
+### Step 2: Analyze the Stack Trace
+
+Read the stack trace starting at the crash site and working outward through its callers, and identify:
+
+1. **The crash site** — the exact function and line where the panic/abort occurs.
+2. **The immediate cause** — what operation failed (e.g., slice indexing on a non-char-boundary, out-of-bounds access, unwrap on None).
+3. **The relevant application frames** — filter out crash handler, signal handler, parking_lot, and stdlib frames. Focus on frames marked "(In app)".
+4. **The data flow** — trace how the invalid data reached the crash site. What computed the bad index, the None value, etc.?
+
+Find the relevant source files in the repository and read them. Pay close attention to:
+- The crashing function and its callers
+- How inputs to the crashing operation are computed
+- Any assumptions the code makes about its inputs (string encoding, array lengths, option values)
+
+### Step 3: Identify the Root Cause
+
+Work backwards from the crash site to determine **what sequence of events or data conditions** produces the invalid state.
+
+Ask yourself: *What user action or sequence of actions could lead to this state?* The crash came from a real user, so there is some natural usage pattern that triggers it.
+
+### Step 4: Write a Reproduction Test
+
+Write a minimal test case that:
+
+1. **Mimics user actions** rather than constructing corrupt state directly. Work from the top down: what does the user do (open a file, type text, trigger a completion, etc.) that eventually causes the internal state to become invalid?
+2. **Exercises the same code path** as the crash. The test should fail in the same function with the same kind of error (e.g., same panic message pattern).
+3. **Is minimal** — include only what's necessary to trigger the crash. Remove anything that isn't load-bearing.
+4. **Lives in the right place** — add the test to the existing test module of the crate where the bug lives. Follow the existing test patterns in that module.
+5. **Avoids overly verbose comments.** The test should be self-explanatory and concise. More detailed descriptions of the test can go in ANALYSIS.md (see the next section).
+
+When the test fails, its stack trace should share the key application frames from the original crash report. The outermost frames (crash handler, signal handling) will differ since we're in a test environment — that's expected.
+
+If you can't reproduce the exact crash but can demonstrate the same class of bug (e.g., same function panicking with a similar invalid input), that is still valuable. Note the difference in your analysis.
+
+### Step 5: Write the Analysis
+
+Create an `ANALYSIS.md` file (in the working directory root, or wherever instructed) with these sections:
+
+```markdown
+# Crash Analysis: <short description>
+
+## Crash Summary
+- **Sentry Issue:** <ID and link if available>
+- **Error:** <the panic/error message>
+- **Crash Site:** <function name and file>
+
+## Root Cause
+<Explain what goes wrong and why. Be specific about the data flow.>
+
+## Reproduction
+<Describe what the test does and how it triggers the same crash.
+Include the exact command to run the test, e.g.:
+`cargo test -p <crate> <test_name>`>
+
+## Suggested Fix
+<Describe the fix approach. Be specific: which function, what check to add,
+what computation to change. If there are multiple options, list them with tradeoffs.>
+```
+
+## Guidelines
+
+- **Don't guess.** If you're unsure about a code path, read the source. Use `grep` to find relevant functions, types, and call sites.
+- **Check the git history.** If the crash appeared in a specific version, `git log` on the relevant files may reveal a recent change that introduced the bug.
+- **Look at existing tests.** The crate likely has tests that show how to set up the relevant subsystem. Follow those patterns rather than inventing new test infrastructure.
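
Step 2 of `investigate.md` asks the reader to focus on frames marked "(In app)". A minimal sketch of that filter is shown here, assuming the frame header layout that `script/sentry-fetch` emits in this change (` <function> in <file> [Line N] (In app)`, with context lines indented by four spaces). The helper name, the regular expression, and the sample report are invented for illustration and do not come from a real crash.

```python
import re

# Matches a frame header line; context lines start with four spaces and are skipped.
FRAME_RE = re.compile(
    r"^ (?P<func>\S.*?) in (?P<file>.+) "
    r"\[Line (?P<line>\d+|null)\] \((?P<marker>In app|Not in app)\)$"
)

# Invented sample in the script's output layout (not real crash data).
SAMPLE_REPORT = "\n".join(
    [
        " core::str::slice_error_fail in library/core/src/str/mod.rs [Line 68] (Not in app)",
        " editor::Editor::insert in crates/editor/src/editor.rs [Line 1234] (In app)",
        "     let text = &buffer[start..end];  <-- SUSPECT LINE",
        " gpui::App::run in crates/gpui/src/app.rs [Line null] (In app)",
    ]
)


def in_app_frames(report_text):
    """Return (function, file, line) tuples for frames marked "(In app)".

    The line is an int, or None when the report says "Line null".
    """
    frames = []
    for raw_line in report_text.splitlines():
        match = FRAME_RE.match(raw_line)
        if match is None or match.group("marker") != "In app":
            continue
        line = match.group("line")
        frames.append(
            (
                match.group("func"),
                match.group("file"),
                None if line == "null" else int(line),
            )
        )
    return frames


if __name__ == "__main__":
    for func, filename, line in in_app_frames(SAMPLE_REPORT):
        print(f"{func} ({filename}:{line})")
```

The result keeps the order of the report, so the first entry is the in-app frame closest to the crash site.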

script/sentry-fetch

@@ -0,0 +1,357 @@
+#!/usr/bin/env python3
+"""Fetch a crash report from Sentry and output formatted markdown.
+
+Usage:
+    script/sentry-fetch <issue-short-id-or-numeric-id>
+    script/sentry-fetch ZED-4VS
+    script/sentry-fetch 7243282041
+
+Authentication (checked in order):
+    1. SENTRY_AUTH_TOKEN environment variable
+    2. Token from ~/.sentryclirc (written by `sentry-cli login`)
+
+If neither is found, the script will print setup instructions and exit.
+"""
+
+import argparse
+import configparser
+import json
+import os
+import sys
+import urllib.error
+import urllib.request
+
+SENTRY_BASE_URL = "https://sentry.io/api/0"
+DEFAULT_SENTRY_ORG = "zed-dev"
+
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="Fetch a crash report from Sentry and output formatted markdown."
+    )
+    parser.add_argument(
+        "issue",
+        help="Sentry issue short ID (e.g. ZED-4VS) or numeric issue ID",
+    )
+    args = parser.parse_args()
+
+    token = find_auth_token()
+    if not token:
+        print(
+            "Error: No Sentry auth token found.",
+            file=sys.stderr,
+        )
+        print(
+            "\nSet up authentication using one of these methods:\n"
+            "  1. Run `sentry-cli login` (stores token in ~/.sentryclirc)\n"
+            "  2. Set the SENTRY_AUTH_TOKEN environment variable\n"
+            "\nGet a token at https://sentry.io/settings/auth-tokens/",
+            file=sys.stderr,
+        )
+        sys.exit(1)
+
+    try:
+        issue_id, short_id, issue = resolve_issue(args.issue, token)
+        event = fetch_latest_event(issue_id, token)
+    except FetchError as err:
+        print(f"Error: {err}", file=sys.stderr)
+        sys.exit(1)
+
+    markdown = format_crash_report(issue, event, short_id)
+    print(markdown)
+
+
+class FetchError(Exception):
+    pass
+
+
+def find_auth_token():
+    """Find a Sentry auth token from environment or ~/.sentryclirc.
+
+    Checks in order:
+        1. SENTRY_AUTH_TOKEN environment variable
+        2. auth.token in ~/.sentryclirc (INI format, written by `sentry-cli login`)
+    """
+    token = os.environ.get("SENTRY_AUTH_TOKEN")
+    if token:
+        return token
+
+    sentryclirc_path = os.path.expanduser("~/.sentryclirc")
+    if os.path.isfile(sentryclirc_path):
+        config = configparser.ConfigParser()
+        try:
+            config.read(sentryclirc_path)
+            token = config.get("auth", "token", fallback=None)
+            if token:
+                return token
+        except configparser.Error:
+            pass
+
+    return None
+
+
+def api_get(path, token):
+    """Make an authenticated GET request to the Sentry API."""
+    url = f"{SENTRY_BASE_URL}{path}"
+    req = urllib.request.Request(url)
+    req.add_header("Authorization", f"Bearer {token}")
+    req.add_header("Accept", "application/json")
+    try:
+        with urllib.request.urlopen(req) as response:
+            return json.loads(response.read().decode("utf-8"))
+    except urllib.error.HTTPError as err:
+        body = err.read().decode("utf-8", errors="replace")
+        try:
+            detail = json.loads(body).get("detail", body)
+        except (json.JSONDecodeError, AttributeError):
+            detail = body
+        raise FetchError(f"Sentry API returned HTTP {err.code} for {path}: {detail}") from err
+    except urllib.error.URLError as err:
+        raise FetchError(f"Failed to connect to Sentry API: {err.reason}") from err
+
+
+def resolve_issue(identifier, token):
+    """Resolve a Sentry issue by short ID or numeric ID.
+
+    Returns (issue_id, short_id, issue_data).
+    """
+    if identifier.isdigit():
+        issue = api_get(f"/issues/{identifier}/", token)
+        return identifier, issue.get("shortId", identifier), issue
+
+    result = api_get(f"/organizations/{DEFAULT_SENTRY_ORG}/shortids/{identifier}/", token)
+    group_id = str(result["groupId"])
+    issue = api_get(f"/issues/{group_id}/", token)
+    return group_id, identifier, issue
+
+
+def fetch_latest_event(issue_id, token):
+    """Fetch the latest event for an issue."""
+    return api_get(f"/issues/{issue_id}/events/latest/", token)
+
+
+def format_crash_report(issue, event, short_id):
+    """Format a Sentry issue and event as a markdown crash report."""
+    lines = []
+
+    title = issue.get("title", "Unknown Crash")
+    lines.append(f"# {title}")
+    lines.append("")
+
+    issue_id = issue.get("id", "unknown")
+    project = issue.get("project", {})
+    project_slug = (
+        project.get("slug", "unknown") if isinstance(project, dict) else str(project)
+    )
+    first_seen = issue.get("firstSeen", "unknown")
+    last_seen = issue.get("lastSeen", "unknown")
+    count = issue.get("count", "unknown")
+    sentry_url = f"https://sentry.io/organizations/{DEFAULT_SENTRY_ORG}/issues/{issue_id}/"
+
+    lines.append(f"**Short ID:** {short_id}")
+    lines.append(f"**Issue ID:** {issue_id}")
+    lines.append(f"**Project:** {project_slug}")
+    lines.append(f"**Sentry URL:** {sentry_url}")
+    lines.append(f"**First Seen:** {first_seen}")
+    lines.append(f"**Last Seen:** {last_seen}")
+    lines.append(f"**Event Count:** {count}")
+    lines.append("")
+
+    format_tags(lines, event)
+    format_entries(lines, event)
+
+    return "\n".join(lines)
+
+
+def format_tags(lines, event):
+    """Extract and format tags from the event."""
+    tags = event.get("tags", [])
+    if not tags:
+        return
+
+    lines.append("## Tags")
+    lines.append("")
+    for tag in tags:
+        key = tag.get("key", "") if isinstance(tag, dict) else ""
+        value = tag.get("value", "") if isinstance(tag, dict) else ""
+        if key:
+            lines.append(f"- **{key}:** {value}")
+    lines.append("")
+
+
+def format_entries(lines, event):
+    """Format exception and thread entries from the event."""
+    entries = event.get("entries", [])
+
+    for entry in entries:
+        entry_type = entry.get("type", "")
+
+        if entry_type == "exception":
+            format_exceptions(lines, entry)
+        elif entry_type == "threads":
+            format_threads(lines, entry)
+
+
+def format_exceptions(lines, entry):
+    """Format exception entries."""
+    exceptions = entry.get("data", {}).get("values", [])
+    if not exceptions:
+        return
+
+    lines.append("## Exceptions")
+    lines.append("")
+
+    for i, exc in enumerate(exceptions):
+        exc_type = exc.get("type", "Unknown")
+        exc_value = exc.get("value", "")
+        mechanism = exc.get("mechanism", {})
+
+        lines.append(f"### Exception {i + 1}")
+        lines.append(f"**Type:** {exc_type}")
+        if exc_value:
+            lines.append(f"**Value:** {exc_value}")
+        if mechanism:
+            mech_type = mechanism.get("type", "unknown")
+            handled = mechanism.get("handled")
+            if handled is not None:
+                lines.append(f"**Mechanism:** {mech_type} (handled: {handled})")
+            else:
+                lines.append(f"**Mechanism:** {mech_type}")
+        lines.append("")
+
+        stacktrace = exc.get("stacktrace")
+        if stacktrace:
+            frames = stacktrace.get("frames", [])
+            lines.append("#### Stacktrace")
+            lines.append("")
+            lines.append("```")
+            lines.append(format_frames(frames))
+            lines.append("```")
+            lines.append("")
+
+
+def format_threads(lines, entry):
+    """Format thread entries, focusing on crashed and current threads."""
+    threads = entry.get("data", {}).get("values", [])
+    if not threads:
+        return
+
+    crashed_threads = [t for t in threads if t.get("crashed", False)]
+    current_threads = [
+        t for t in threads if t.get("current", False) and not t.get("crashed", False)
+    ]
+    other_threads = [
+        t
+        for t in threads
+        if not t.get("crashed", False) and not t.get("current", False)
+    ]
+
+    lines.append("## Threads")
+    lines.append("")
+
+    for thread in crashed_threads + current_threads:
+        format_single_thread(lines, thread, show_frames=True)
+
+    if other_threads:
+        lines.append(f"*({len(other_threads)} other threads omitted)*")
+        lines.append("")
+
+
+def format_single_thread(lines, thread, show_frames=False):
+    """Format a single thread entry."""
+    thread_id = thread.get("id", "?")
+    thread_name = thread.get("name", "unnamed")
+    crashed = thread.get("crashed", False)
+    current = thread.get("current", False)
+
+    markers = []
+    if crashed:
+        markers.append("CRASHED")
+    if current:
+        markers.append("current")
+    marker_str = f" ({', '.join(markers)})" if markers else ""
+
+    lines.append(f"### Thread {thread_id}: {thread_name}{marker_str}")
+    lines.append("")
+
+    if not show_frames:
+        return
+
+    stacktrace = thread.get("stacktrace")
+    if not stacktrace:
+        return
+
+    frames = stacktrace.get("frames", [])
+    if frames:
+        lines.append("```")
+        lines.append(format_frames(frames))
+        lines.append("```")
+        lines.append("")
+
+
+def format_frames(frames):
+    """Format stack trace frames for display.
+
+    Sentry provides frames from outermost caller to innermost callee,
+    so we reverse them to show the most recent (crashing) call first,
+    matching the convention used in most crash report displays.
+    """
+    output_lines = []
+
+    for frame in reversed(frames):
+        func = frame.get("function") or frame.get("symbol") or "unknown"
+        filename = (
+            frame.get("filename")
+            or frame.get("absPath")
+            or frame.get("abs_path")
+            or "unknown file"
+        )
+        line_no = frame.get("lineNo") or frame.get("lineno")
+        in_app = frame.get("inApp", frame.get("in_app", False))
+
+        app_marker = "(In app)" if in_app else "(Not in app)"
+        line_info = f"Line {line_no}" if line_no else "Line null"
+
+        output_lines.append(f" {func} in {filename} [{line_info}] {app_marker}")
+
+        context_lines = build_context_lines(frame, line_no)
+        output_lines.extend(context_lines)
+
+    return "\n".join(output_lines)
+
+
+def build_context_lines(frame, suspect_line_no):
+    """Build context code lines for a single frame.
+
+    Handles both Sentry response formats:
+    - preContext/contextLine/postContext (separate fields)
+    - context as an array of [line_no, code] tuples
+    """
+    output = []
+
+    pre_context = frame.get("preContext") or frame.get("pre_context") or []
+    context_line = frame.get("contextLine") or frame.get("context_line")
+    post_context = frame.get("postContext") or frame.get("post_context") or []
+
+    if context_line is not None or pre_context or post_context:
+        for code_line in pre_context:
+            output.append(f"    {code_line}")
+        if context_line is not None:
+            output.append(f"    {context_line}  <-- SUSPECT LINE")
+        for code_line in post_context:
+            output.append(f"    {code_line}")
+        return output
+
+    context = frame.get("context") or []
+    for ctx_entry in context:
+        if isinstance(ctx_entry, list) and len(ctx_entry) >= 2:
+            ctx_line_no = ctx_entry[0]
+            ctx_code = ctx_entry[1]
+            suspect = "  <-- SUSPECT LINE" if ctx_line_no == suspect_line_no else ""
+            output.append(f"    {ctx_code}{suspect}")
+
+    return output
+
+
+if __name__ == "__main__":
+    main()
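
`find_auth_token` falls back to the `token` key in the `[auth]` section of `~/.sentryclirc`. The sketch below isolates that lookup so it can be tried on an in-memory string instead of a real file. The function name `token_from_ini` and the sample contents are hypothetical, and the token shown is a placeholder rather than a real credential.

```python
import configparser

# Placeholder file contents in the INI layout the script looks for.
SAMPLE_SENTRYCLIRC = """\
[auth]
token = example-token-not-real
"""


def token_from_ini(text):
    """Return the auth.token value from INI text, or None if it is absent.

    Parse errors are treated as "no token", mirroring how the script
    ignores configparser errors.
    """
    config = configparser.ConfigParser()
    try:
        config.read_string(text)
        token = config.get("auth", "token", fallback=None)
    except configparser.Error:
        return None
    return token or None


if __name__ == "__main__":
    print(token_from_ini(SAMPLE_SENTRYCLIRC))
    print(token_from_ini("[defaults]\norg = zed-dev\n"))
```

Returning `None` for both a missing key and an unparsable file lets the caller print the same setup instructions in either case.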