"git bug rm" broken on master branch

Timeline

Darth Rubik (DarthRubik) opened (edited)

It appears that the "git bug rm" command hangs on my machine when I try to remove a test bug that I created....
I think I have tracked it down to the lock taking on line 502 of "cache/repo_cache_bug.go".
I don't know enough about the code to find the root cause at the moment...... Side question: In order to remove the bug I removed the branch from the "refs/bugs" folder and then deleted the git-bug cache....Should that work?

Michael Muré (MichaelMure) commented

Yes that's enough to do that manually.

What error did you get?

Darth Rubik (DarthRubik) commented

I don't get an error it just hangs forever

Steve Moyer (smoyer64) commented

@MichaelMure - I've confirmed this issue on my local machine and will have a failing test for it tomorrow. If you'd like, you can assign this to me.

Steve Moyer (smoyer64) commented

Running this locally (Debian Bullseye Linux) results in the hanging CLI as noted above and CTRL-C doesn't terminate the hung state. Running this using Go test running and the new testEnv created as part of https://github.com/selesy/git-bug/tree/fix/778-git-bug-rm-broken allows the supervising Go routine to terminate the operation on timeout and return each Go routine's state. Here's the output of the test runner when that timeout occurs:

smoyer1@smoyer1-carbon:~/git/git-bug$ go test -coverprofile=/tmp/junk ./commands -run TestRm
^C^C^C^CBuilding identity cache... Done.
Building bug cache... Done.

coverage: 24.5% of statements
panic: test timed out after 10m0s

goroutine 36 [running]:
testing.(*M).startAlarm.func1()
        /opt/go/1.18.2/src/testing/testing.go:2029 +0x8e
created by time.goFunc
        /opt/go/1.18.2/src/time/sleep.go:176 +0x32

goroutine 1 [chan receive, 10 minutes]:
testing.(*T).Run(0xc00030d1e0, {0x11906b8?, 0x51b005?}, 0x1242530)
        /opt/go/1.18.2/src/testing/testing.go:1487 +0x37a
testing.runTests.func1(0xc00030d1e0?)
        /opt/go/1.18.2/src/testing/testing.go:1839 +0x6e
testing.tRunner(0xc00030d1e0, 0xc0000d7ca0)
        /opt/go/1.18.2/src/testing/testing.go:1439 +0x102
testing.runTests(0xc0003177c0?, {0x1a75980, 0x5, 0x5}, {0x7f6f4d3b8108?, 0x40?, 0x1a84400?})
        /opt/go/1.18.2/src/testing/testing.go:1837 +0x457
testing.(*M).Run(0xc0003177c0)
        /opt/go/1.18.2/src/testing/testing.go:1719 +0x5d9
main.main()
        _testmain.go:187 +0x255

goroutine 6 [select, 10 minutes]:
github.com/blevesearch/bleve/index.AnalysisWorker({0xc0002edc80?, 0xc0002edce0?})
        /home/smoyer1/go/pkg/mod/github.com/blevesearch/bleve@v1.0.14/index/analysis.go:102 +0xc6
created by github.com/blevesearch/bleve/index.NewAnalysisQueue
        /home/smoyer1/go/pkg/mod/github.com/blevesearch/bleve@v1.0.14/index/analysis.go:94 +0x92

goroutine 7 [select, 10 minutes]:
github.com/blevesearch/bleve/index.AnalysisWorker({0xc0002edc80?, 0xc0002edce0?})
        /home/smoyer1/go/pkg/mod/github.com/blevesearch/bleve@v1.0.14/index/analysis.go:102 +0xc6
created by github.com/blevesearch/bleve/index.NewAnalysisQueue
        /home/smoyer1/go/pkg/mod/github.com/blevesearch/bleve@v1.0.14/index/analysis.go:94 +0x92

goroutine 8 [select, 10 minutes]:
github.com/blevesearch/bleve/index.AnalysisWorker({0xc0002edc80?, 0xc0002edce0?})
        /home/smoyer1/go/pkg/mod/github.com/blevesearch/bleve@v1.0.14/index/analysis.go:102 +0xc6
created by github.com/blevesearch/bleve/index.NewAnalysisQueue
        /home/smoyer1/go/pkg/mod/github.com/blevesearch/bleve@v1.0.14/index/analysis.go:94 +0x92

goroutine 9 [select, 10 minutes]:
github.com/blevesearch/bleve/index.AnalysisWorker({0xc0002edc80?, 0xc0002edce0?})
        /home/smoyer1/go/pkg/mod/github.com/blevesearch/bleve@v1.0.14/index/analysis.go:102 +0xc6
created by github.com/blevesearch/bleve/index.NewAnalysisQueue
        /home/smoyer1/go/pkg/mod/github.com/blevesearch/bleve@v1.0.14/index/analysis.go:94 +0x92

goroutine 10 [IO wait, 10 minutes]:
internal/poll.runtime_pollWait(0x7f6f25dfb918, 0x72)
        /opt/go/1.18.2/src/runtime/netpoll.go:302 +0x89
internal/poll.(*pollDesc).wait(0xc0000da700?, 0xc000110490?, 0x0)
        /opt/go/1.18.2/src/internal/poll/fd_poll_runtime.go:83 +0x32
internal/poll.(*pollDesc).waitRead(...)
        /opt/go/1.18.2/src/internal/poll/fd_poll_runtime.go:88
internal/poll.(*FD).ReadMsg(0xc0000da700, {0xc000110490, 0x10, 0x10}, {0xc000114020, 0x1000, 0x1000}, 0x7f6f24d94118?)
        /opt/go/1.18.2/src/internal/poll/fd_unix.go:304 +0x31c
net.(*netFD).readMsg(0xc0000da700, {0xc000110490?, 0x7f6f24d94118?, 0x0?}, {0xc000114020?, 0x418ce5?, 0x1aaef60?}, 0xc0000c5d08?)
        /opt/go/1.18.2/src/net/fd_posix.go:78 +0x37
net.(*UnixConn).readMsg(0xc000010730, {0xc000110490?, 0xc0000c5d80?, 0x40f345?}, {0xc000114020?, 0xb?, 0x7f6f26a86fff?})
        /opt/go/1.18.2/src/net/unixsock_posix.go:115 +0x4f
net.(*UnixConn).ReadMsgUnix(0xc000010730, {0xc000110490?, 0xc0000c5df8?, 0x40f345?}, {0xc000114020?, 0x0?, 0x0?})
        /opt/go/1.18.2/src/net/unixsock.go:143 +0x3c
github.com/godbus/dbus.(*oobReader).Read(0xc000114000, {0xc000110490?, 0xc0000c5e20?, 0x40f6a7?})
        /home/smoyer1/go/pkg/mod/github.com/godbus/dbus@v0.0.0-20190726142602-4481cbc300e2/transport_unix.go:21 +0x45
io.ReadAtLeast({0x1360960, 0xc000114000}, {0xc000110490, 0x10, 0x10}, 0x10)
        /opt/go/1.18.2/src/io/io.go:331 +0x9a
io.ReadFull(...)
        /opt/go/1.18.2/src/io/io.go:350
github.com/godbus/dbus.(*unixTransport).ReadMessage(0xc0002f3170)
        /home/smoyer1/go/pkg/mod/github.com/godbus/dbus@v0.0.0-20190726142602-4481cbc300e2/transport_unix.go:91 +0x11e
github.com/godbus/dbus.(*Conn).inWorker(0xc0001e27e0)
        /home/smoyer1/go/pkg/mod/github.com/godbus/dbus@v0.0.0-20190726142602-4481cbc300e2/conn.go:294 +0x3b
created by github.com/godbus/dbus.(*Conn).Auth
        /home/smoyer1/go/pkg/mod/github.com/godbus/dbus@v0.0.0-20190726142602-4481cbc300e2/auth.go:118 +0x9ed

goroutine 13 [semacquire, 10 minutes]:
sync.runtime_SemacquireMutex(0x40f6a7?, 0x10?, 0x1130dc0?)
        /opt/go/1.18.2/src/runtime/sema.go:71 +0x25
sync.(*RWMutex).Lock(0x13723d0?)
        /opt/go/1.18.2/src/sync/rwmutex.go:144 +0x71
github.com/MichaelMure/git-bug/cache.(*RepoCache).ResolveBug(0xc00013a3f0, {0xc000432840, 0x40})
        /home/smoyer1/git/git-bug/cache/repo_cache_bug.go:163 +0x18f
github.com/MichaelMure/git-bug/cache.(*RepoCache).ResolveBugMatcher(0x7f6f4d3b85b8?, 0x47591e?)
        /home/smoyer1/git/git-bug/cache/repo_cache_bug.go:236 +0x47
github.com/MichaelMure/git-bug/cache.(*RepoCache).ResolveBugPrefix(0x200?, {0xc000036fc0?, 0xc0004a5050?})
        /home/smoyer1/git/git-bug/cache/repo_cache_bug.go:209 +0x48
github.com/MichaelMure/git-bug/cache.(*RepoCache).RemoveBug(0xc00013a3f0, {0xc000036fc0?, 0x11c4997?})
        /home/smoyer1/git/git-bug/cache/repo_cache_bug.go:504 +0x75
github.com/MichaelMure/git-bug/commands.runRm(0xc0004411c0, {0xc000128550, 0x1361600?, 0xc000010018?})
        /home/smoyer1/git/git-bug/commands/rm.go:34 +0x65
github.com/MichaelMure/git-bug/commands.newRmCommand.func1(0xc0001f9d20?, {0xc000128550?, 0xc000010018?, 0xc0003807e0?})
        /home/smoyer1/git/git-bug/commands/rm.go:18 +0x2c
github.com/MichaelMure/git-bug/commands.closeBackend.func1(0xc00045e280?, {0xc000128550, 0x1, 0x1})
        /home/smoyer1/git/git-bug/commands/env.go:161 +0xd6
github.com/spf13/cobra.(*Command).execute(0xc00045e280, {0xc000128520, 0x1, 0x1})
        /home/smoyer1/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:856 +0x67c
github.com/spf13/cobra.(*Command).ExecuteC(0xc000412280)
        /home/smoyer1/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:974 +0x3b4
github.com/spf13/cobra.(*Command).Execute(...)
        /home/smoyer1/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:902
github.com/spf13/cobra.(*Command).ExecuteContext(...)
        /home/smoyer1/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:895
github.com/MichaelMure/git-bug/commands_test.(*testEnv).Execute(0xc000462360?, 0xc00030d380?)
        /home/smoyer1/git/git-bug/commands/root_test.go:60 +0xa5
github.com/MichaelMure/git-bug/commands_test.TestRm(0xc00030d380)
        /home/smoyer1/git/git-bug/commands/rm_test.go:13 +0xce
testing.tRunner(0xc00030d380, 0x1242530)
        /opt/go/1.18.2/src/testing/testing.go:1439 +0x102
created by testing.(*T).Run
        /opt/go/1.18.2/src/testing/testing.go:1486 +0x35f

goroutine 19 [semacquire, 3 minutes]:
sync.runtime_SemacquireMutex(0x48233b?, 0x6d?, 0x1?)
        /opt/go/1.18.2/src/runtime/sema.go:71 +0x25
sync.(*Mutex).lockSlow(0xc00013a418)
        /opt/go/1.18.2/src/sync/mutex.go:162 +0x165
sync.(*Mutex).Lock(...)
        /opt/go/1.18.2/src/sync/mutex.go:81
sync.(*RWMutex).Lock(0x0?)
        /opt/go/1.18.2/src/sync/rwmutex.go:139 +0x36
github.com/MichaelMure/git-bug/cache.(*RepoCache).Close(0xc00013a3f0)
        /home/smoyer1/git/git-bug/cache/repo_cache.go:152 +0x45
github.com/MichaelMure/git-bug/commands.loadBackend.func1.1.1()
        /home/smoyer1/git/git-bug/commands/env.go:120 +0x45
github.com/MichaelMure/git-bug/util/interrupt.clean()
        /home/smoyer1/git/git-bug/util/interrupt/cleaner.go:76 +0xe5
github.com/MichaelMure/git-bug/util/interrupt.RegisterCleaner.func2()
        /home/smoyer1/git/git-bug/util/interrupt/cleaner.go:57 +0xd1
created by github.com/MichaelMure/git-bug/util/interrupt.RegisterCleaner
        /home/smoyer1/git/git-bug/util/interrupt/cleaner.go:51 +0x245

goroutine 35 [syscall, 3 minutes]:
os/signal.signal_recv()
        /opt/go/1.18.2/src/runtime/sigqueue.go:151 +0x2f
os/signal.loop()
        /opt/go/1.18.2/src/os/signal/signal_unix.go:23 +0x19
created by os/signal.Notify.func1.1
        /opt/go/1.18.2/src/os/signal/signal.go:151 +0x2a
FAIL    github.com/MichaelMure/git-bug/commands600.120s
FAIL
smoyer1@smoyer1-carbon:~/git/git-bug$ 
'''

Hopefully I'll have a solution (and add the appropriate assertions for the `rm <bugID>` test) in the next couple days (weekend and holiday here).

Steve Moyer (smoyer64) commented

@MichaelMure - This appears to be failing because RepoCache.RemoveBug() grabs a read-lock when calling RepoCache.ResolveBugPrefix() but the underlying code called from that method already locks/unlocks the cache as needed. Removing the RLock on repo_cache_bug.go:502 does indeed solve the problem but I'd like a second set of eyes to verify that there is already sufficient locking when resolving the bug prefix.

Michael Muré (MichaelMure) commented

@smoyer64 I had a look, that sounds correct. In addition, other calls to ResolveBugPrefix don't lock that mutex so that's very likely the issue.

Michael Muré (MichaelMure) commented

Also, pretty cool that you are writing test for the commands. It was the intention when coming up with that code structure for commands, but I never ended up actually writing them :-)

Steve Moyer (smoyer64) commented

@MichaelMure - PR submitted!

It was pretty trivial to reproduce this issue manually, but writing tests is a great way to intimately learn the code. There are a couple changes related to making the command package more testable and there will probably be a few more required to cover other parts of the code (e.g. the interactive version of add and user create commands will need to redirect stdin along with something like send/expect helpers.)

Steve Moyer (smoyer64) closed the bug