Steve Moyer (smoyer64) opened
As discussed in #803, including a test that calls t.Parallel() and uses a temporary directory causes unpredictable failures on the Windows build server. Rerunning a failed check often succeeds but failures report that many different tests randomly fail. This is a pretty clear sign that there's a concurrency issue. Here's the code that originally caused the problem:
func TestGoGitRepo_Indexes(t *testing.T) {
t.Parallel()
plainRoot, err := ioutil.TempDir("", "")
require.NoError(t, err)
t.Cleanup(func() {
require.NoError(t, os.RemoveAll(plainRoot))
})
repo, err := InitGoGitRepo(plainRoot, namespace)
require.NoError(t, err)
t.Cleanup(func() {
require.NoError(t, repo.Close())
})
// Can create indices
indexA, err := repo.GetBleveIndex("a")
require.NoError(t, err)
require.NotZero(t, indexA)
require.FileExists(t, filepath.Join(plainRoot, ".git", namespace, "indexes", "a", "index_meta.json"))
require.FileExists(t, filepath.Join(plainRoot, ".git", namespace, "indexes", "a", "store"))
indexB, err := repo.GetBleveIndex("b")
require.NoError(t, err)
require.NotZero(t, indexB)
// Can get an existing index
indexA, err = repo.GetBleveIndex("a")
require.NoError(t, err)
require.NotZero(t, indexA)
// Can delete an index
err = repo.ClearBleveIndex("a")
require.NoError(t, err)
require.NoDirExists(t, filepath.Join(plainRoot, ".git", namespace, "indexes", "a"))
}
Linux and MacOS filesystems have a different behavior than Windows - if you delete a file that's in use, it will remove the directory entry but the inodes won't be deleted until all the related file handles have been closed. Changing the code above to the following makes the tests fail less frequently:
func TestGoGitRepo_Indexes(t *testing.T) {
t.Parallel()
plainRoot := t.TempDir()
repo, err := InitGoGitRepo(plainRoot, namespace)
require.NoError(t, err)
t.Cleanup(func() {
require.NoError(t, repo.Close())
})
// Can create indices
indexA, err := repo.GetBleveIndex("a")
require.NoError(t, err)
require.NotZero(t, indexA)
require.FileExists(t, filepath.Join(plainRoot, ".git", namespace, "indexes", "a", "index_meta.json"))
require.FileExists(t, filepath.Join(plainRoot, ".git", namespace, "indexes", "a", "store"))
indexB, err := repo.GetBleveIndex("b")
require.NoError(t, err)
require.NotZero(t, indexB)
// Can get an existing index
indexA, err = repo.GetBleveIndex("a")
require.NoError(t, err)
require.NotZero(t, indexA)
// Can delete an index
err = repo.ClearBleveIndex("a")
require.NoError(t, err)
require.NoDirExists(t, filepath.Join(plainRoot, ".git", namespace, "indexes", "a"))
}
Using t.TempDir() to create each test's temporary directory also registers an alternate cleanup function that has a different behavior from os.RemoveAll(). This function includes the following ToB comment (see https://github.com/golang/go/blob/8a56c7742d96c8ef8e8dcecaf3d1c0e9f022f708/src/testing/testing.go#L1115):
// removeAll is like os.RemoveAll, but retries Windows "Access is denied."
// errors up to an arbitrary timeout.
//
// Those errors have been known to occur spuriously on at least the
// windows-amd64-2012 builder (https://go.dev/issue/50051), and can only occur
// legitimately if the test leaves behind a temp file that either is still open
// or the test otherwise lacks permission to delete. In the case of legitimate
// failures, a failing test may take a bit longer to fail, but once the test is
// fixed the extra latency will go away.
It's possible that universally using t.TempDir() within git-bug will eliminate this issue. It's going to be a bit hard to be sure since the only way to tell is to run the tests repeatedly on Github. There are only three additional occurrences of os.TempDir() in the code as most of the tests rely on CreateGoGitTestRepo for their test repository instance. I've been trying different scenarios in #804 and will make this change next.