Finding which commit introduced a bug

Wile E. is a software engineer for ACME Corporation.

Before leaving for the holidays, he released the latest version of ACME's most recent video game: Pangolin, a game in which the PC tries to guess an animal you just thought of.

The source code for the game is here.

After returning from vacation, he checks out the code and tries to run the tests:

$ git clone https://github.com/ziccardi/pangolin 
$ cd pangolin
$ git checkout BISECT_EXERCISE
$ go test ./...

?       pangolini       [no test files]
--- FAIL: TestMemoryTree (0.00s)
    --- FAIL: TestMemoryTree/Find_whale (0.00s)
        memory_cell_test.go:100: 
                Error Trace:    /Users/ziccardi/work/ziccardi/blog/pangolin/memory/memory_cell_test.go:100
                Error:          Not equal: 
                                expected: "a whale"
                                actual  : "Is it scaly?"

                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -1 +1 @@
                                -a whale
                                +Is it scaly?
                Test:           TestMemoryTree/Find_whale
    --- FAIL: TestMemoryTree/Find_pangolin (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x110c800]

Wile: "Oh my! It was months since I last ran the tests! I don't even know when the bug was first introduced! I need to find the offending commit!"

Road Runner: "Hey, Wile! Welcome back! What are you working on?"

Wile: "I just found out that I probably made some code mistakes some time ago. I'm testing each commit to find out when it happened."

Road Runner: "Oh... good luck!"

A few seconds later, while Wile is still working on the second commit, Road Runner returns.

Road Runner: "The offending commit is the 3e7b5549b4e9a one"

Wile: "How did you find out so quickly?"

Road Runner: "Because, as usual, you are walking in the jungle while there is a highway next to you. I'll show you."

$ git bisect start \
   && git bisect good 4f015ddf8a003f7d48ab316d76bc9ae36c9a619e \
   && git bisect bad HEAD \
   && git bisect run go test ./... && git bisect reset

... cut ...
3e7b5549b4e9a2b68f825252cc31afdc1bff6f3c is the first bad commit

Road Runner: "See? Very easy!"

Wile is speechless.

Wile: "Road, please explain this magic to me. Now!"

The Bisect command

The git bisect command allows you to search for a commit using a binary search algorithm, and therefore needs a start (the last known to be a good commit) and an end (a bad commit). That is exactly what we do with the following commands:

  • git bisect start: starts a bisect process
  • git bisect good {REV}: set the good commit. In the example, we specify the hash of the commit
  • git bisect bad {REV}: set the bad commit. In the example, we are specifying the current HEAD

Now that we know how to initialize a bisect session, we can start looking for the first wrong commit (the one that introduced the bug).
There are two ways to look for the wrong commit: manually or automatically (as I did before): the latter is much faster but needs a script that returns 0 or 1 depending on whether it is successful.

The manual process

First, let's look at the list of commits:

$ git log --pretty=oneline

e33e9788b0e9a0235d0f9cd0f48cb2a7ff9b6613 (HEAD -> BISECT_EXERCISE, origin/BISECT_EXERCISE) split files. added cell interface
9a948544601bec109c9b9bf717ac7e7ec6a61104 renamed memory objects. Removed some code from main
17a722bf2bb669819baa03cf54922a3e6b1038a3 made memory file configurable
dd5dd5c09a9e841e7ef6cd245d121ff3593b8318 Added readme
312faf10cc500cfb6e7207670bbe4455a2e2cb73 added persistence
d00eebe8259d0250bf8fe5791b326d3816b06396 Added readline utility function
3e7b5549b4e9a2b68f825252cc31afdc1bff6f3c Moved memory object to its own file
cc175fe66e2416d74bb8386d24544b98622371c5 Added functinal options pattern
4f015ddf8a003f7d48ab316d76bc9ae36c9a619e first commit

We know that the code definitely worked when the first commit was done (4f015ddf8a003f7d48ab316d76bc9ae36c9a619e) and we know that we definitely have a problem in the current HEAD (e33e9788b0e9a0235d0f9cd0f48cb2a7ff9b6613):

$ git bisect start
$ git bisect bad HEAD
$ git bisect good 4f015ddf8a003f7d48ab316d76bc9ae36c9a619e
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[312faf10cc500cfb6e7207670bbe4455a2e2cb73] added persistence

As you can see, bisect has moved us to commit 312faf10: in the middle. Let's check if it is a good or bad commit:

$ go test ./...
?       pangolini       [no test files]
--- FAIL: TestMemoryTree (0.00s)
    --- FAIL: TestMemoryTree/Find_whale (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x119a89b]

goroutine 36 [running]:
testing.tRunner.func1.2({0x11ba2c0, 0x1360900})
        /usr/local/opt/go/libexec/src/testing/testing.go:1396 +0x24e
testing.tRunner.func1()
        /usr/local/opt/go/libexec/src/testing/testing.go:1399 +0x39f
panic({0x11ba2c0, 0x1360900})
        /usr/local/opt/go/libexec/src/runtime/panic.go:884 +0x212
pangolini/memory_test.TestMemoryTree.func1(0x0?)
        /Users/ziccardi/work/ziccardi/blog/pangolin/memory/memory_cell_test.go:103 +0x11b
testing.tRunner(0xc000105380, 0xc000111030)
        /usr/local/opt/go/libexec/src/testing/testing.go:1446 +0x10b
created by testing.(*T).Run
        /usr/local/opt/go/libexec/src/testing/testing.go:1493 +0x35f
FAIL    pangolini/memory        0.131s
FAIL

It has failed. This commit is bad.

$ git bisect bad
Bisecting: 1 revision left to test after this (roughly 1 step)
[3e7b5549b4e9a2b68f825252cc31afdc1bff6f3c] Moved memory object to its own file

Bisect moved us again in the middle between the last known good commit and the bad commit we just discovered. Let's check if this is a good or a bad one.

$ go test ./...                                                                                                                                           
?       pangolini       [no test files]
--- FAIL: TestMemoryTree (0.00s)
    --- FAIL: TestMemoryTree/Find_whale (0.00s)

This commit is bad too.

$ git bisect bad
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[cc175fe66e2416d74bb8386d24544b98622371c5] Added functinal options pattern

Checking the code this time we get

$ go test ./...
?       pangolini       [no test files]

There were no tests in this commit. We report this as good.

$ git bisect good
3e7b5549b4e9a2b68f825252cc31afdc1bff6f3c is the first bad commit
commit 3e7b5549b4e9a2b68f825252cc31afdc1bff6f3c

We finally found it! If we check the contents of the 3e7b5549 commit, we will see:

$ git diff 3e7b5549~ 3e7b5549
--- 8x cut ---
+func WithNo(branch *MemoryCell) MemoryCellOption {
+       return func(cell *MemoryCell) {
+               cell.Yes = branch
+       }
+}

That's the issue! The WithNo functional Option is setting cell.Yes instead of cell.No.

We can finally end the bisect session with:

$ git bisect reset
Previous HEAD position was cc175fe Added functinal options pattern
Switched to branch 'BISECT_EXERCISE'
Your branch is up to date with 'origin/BISECT_EXERCISE'.

The automatic process

We can automate the whole process using a script. In our case, the wrong commits are the commits where the tests are failing. Thanks to the run command we can do:

$ git bisect start
$ git bisect good 4f015ddf8a003f7d48ab316d76bc9ae36c9a619e
$ git bisect bad HEAD
$ git bisect run go test ./...
---8x--- cut ---
3e7b5549b4e9a2b68f825252cc31afdc1bff6f3c is the first bad commit

Conclusion

Wile threw himself headlong into the code without checking if anything could alleviate the effort required to fix the problem: for this reason, he wasted a lot of time on a job that could be done in seconds.

Instead of checking each commit manually, Wile could have used the bisect command, drastically reducing the effort needed to find the guilty commit (and would have avoided being made fun of by Road Runner for the umpteenth time).

It is always worth trying to gain some more knowledge about the tools we use every day!

Enjoy your increased git knowledge! If you liked this article, please give a thumbs up!