From 5a7b8f7fe30a1468245b00eb1b4f119e5bf011fd Mon Sep 17 00:00:00 2001
From: Thorsten Ball
Date: Tue, 7 May 2024 15:46:41 +0200
Subject: [PATCH] linux: Fix restarting by waiting for sockets to be closed (#11488)

This fixes a race condition that showed up when trying to restart
Nightly/Preview/...

When running with these release channels, Zed tries to ensure that
there's only one instance of Zed running. It does that by listening on a
TCP socket to which other instances can connect on start. If the other
instance receives a message, it knows that another Zed instance is
running and exits.

On Linux, though, we ran into a race condition:

1. `kill -0`, which checks whether a process is still running, returns
   an error, signalling that the old Zed process has exited
2. BUT: the process was still listening on the TCP port.

It seems that on Linux, process resources aren't guaranteed to be
cleaned up as soon as signal handling stops working for a process.

The fix is to wait until the process is no longer listening on any TCP
sockets.

There's a slight downside to this: GPUI processes that never listen on
any TCP sockets now have to pay the cost of an additional `lsof` call
when restarting. We do think that it's a reasonable tradeoff for now
though, since the other options (extending the platform interface to
provide callbacks, sharing the listening port in the framework, ...)
seem wider-reaching only to fix a very local bug.
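
To illustrate why a still-bound port breaks the restart, here is a minimal sketch of a port-based single-instance check. It is not Zed's actual implementation: the helper name `try_claim_instance_port` is hypothetical, and it simplifies the check to "binding the port fails", whereas Zed connects to the existing listener and sends a message. The point is only that, as long as the old process still holds the port, a freshly started process concludes another instance is running.

```rust
use std::io::ErrorKind;
use std::net::{Ipv4Addr, SocketAddrV4, TcpListener};

/// Hypothetical helper (not part of GPUI or Zed): try to become the single
/// running instance by binding a localhost TCP port.
///
/// Returns `Ok(Some(listener))` if this process now owns the port, and
/// `Ok(None)` if some other process is still listening on it.
fn try_claim_instance_port(port: u16) -> std::io::Result<Option<TcpListener>> {
    let addr = SocketAddrV4::new(Ipv4Addr::LOCALHOST, port);
    match TcpListener::bind(addr) {
        Ok(listener) => Ok(Some(listener)),
        // The port is still held, e.g. by an old process whose sockets
        // have not been cleaned up yet.
        Err(err) if err.kind() == ErrorKind::AddrInUse => Ok(None),
        Err(err) => Err(err),
    }
}
```

Under this sketch, a restarted process that runs before the old listener is gone gets `None` and would exit, which is the behaviour the additional `lsof` wait loop is meant to avoid.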

Release Notes:

- N/A

Co-authored-by: Bennet
---
 crates/gpui/src/platform/linux/platform.rs | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/crates/gpui/src/platform/linux/platform.rs b/crates/gpui/src/platform/linux/platform.rs
index 1b5e739928b8d19f96258cc4de4f5cd1ce099f51..842b0d6f53dc893d24435d83fbf5fc67a3b7951d 100644
--- a/crates/gpui/src/platform/linux/platform.rs
+++ b/crates/gpui/src/platform/linux/platform.rs
@@ -150,12 +150,22 @@ impl Platform for P {
             }
         };
 
-        // script to wait for the current process to exit and then restart the app
+        log::info!("Restarting process, using app path: {:?}", app_path);
+
+        // Script to wait for the current process to exit and then restart the app.
+        // We also wait for possibly open TCP sockets by the process to be closed,
+        // since on Linux it's not guaranteed that a process' resources have been
+        // cleaned up when `kill -0` returns.
         let script = format!(
             r#"
             while kill -0 {pid} 2>/dev/null; do
                 sleep 0.1
             done
+
+            while lsof -nP -iTCP -a -p {pid} 2>/dev/null; do
+                sleep 0.1
+            done
+
             {app_path}
             "#,
             pid = app_pid,