I’m probably the last person on earth to discover this – but just today, I used the Mercurial bisect command, and thought I’d write up my experiences in case anyone else hasn’t played with it before. I’d read about hg bisect in the hgbook, but never had an opportunity to use it in anger.

Here’s the problem I was seeing – in builds of xVM Server that I’ve been doing, we were producing ISO images, but after installation, the pkg command wasn’t working properly. Exploring the image a bit, with some help from the pkg python stack trace, I found the problem was that some items in /var/pkg were symlinks pointing to a non-existent mountpoint on the installed image.

Looking at the build logs from the distro constructor, I saw that cpio was complaining that there was no space left on the device it was writing to. Digging around a bit more, and running another build just to make sure, I found the source of the problem: we were df’ing the source directory for the cpio, doing a mkfile of that size, creating a lofi device that big, then creating a UFS filesystem on that device. The space overhead incurred by the filesystem meant that we were trying to pour a gallon into a pint pot.
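To make the sizing problem concrete, here is a minimal Python sketch of the arithmetic. It is not the distro constructor's code: the function name and the idea of applying the margin to the data size are assumptions for illustration, and the 1% figure is borrowed from the "Add 1%" comment that turns up in the fix later on. Whether 1% is enough in general depends on the filesystem.

```python
def backing_file_bytes(data_bytes: int, margin_percent: int = 1) -> int:
    """Size for the file backing the lofi device: the data plus a margin.

    Hypothetical helper. The margin is rounded up to a whole byte so that
    small inputs still get some headroom for filesystem overhead.
    """
    margin = (data_bytes * margin_percent + 99) // 100  # ceiling division
    return data_bytes + margin


# Sizing the device to exactly the data size leaves no room for metadata;
# adding the margin gives the filesystem somewhere to put its own structures.
print(backing_file_bytes(1000))
```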

So I knew what the problem was, and pulling the tip changeset of the distro constructor showed me that it was already fixed (my favourite kind of bug!): the fix was to make the file which backs the lofi device just a bit bigger. My question was, which changeset introduced this fix? Enter hg bisect.

With it, you just need to identify a revision where you know the code is bad, a revision where you know the code is good, and a test that determines whether the change is present. In my case, the test was really short:

grep "Add 1%" build_dist.lib

– but you could conceivably have the test build an entire OS image, install it, and check for the change. The bisect command then does a binary chop through the changesets, testing a revision roughly halfway between the known-bad and known-good ones each time, narrowing down to where the change was introduced.
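The chop itself is an ordinary binary search. The Python sketch below shows the idea under some loud assumptions: history is treated as a simple line of integer revision numbers (real Mercurial history is a graph, and hg bisect handles that), the tip is assumed to be revision 105, and `first_good` and the lambda standing in for the grep test are hypothetical names, not anything Mercurial provides.

```python
from typing import Callable, Tuple


def first_good(bad: int, good: int, is_good: Callable[[int], bool]) -> Tuple[int, int]:
    """Return (first good revision, number of tests run).

    Assumes revision `bad` lacks the change, revision `good` has it, and
    every revision after the first good one also has it.
    """
    tests = 0
    while good - bad > 1:
        mid = (bad + good) // 2  # try a revision halfway between the two
        tests += 1
        if is_good(mid):
            good = mid  # change already present: look earlier
        else:
            bad = mid   # change not present yet: look later
    return good, tests


# Stand-in for the grep test: pretend the fix landed in revision 102.
revision, tests = first_good(bad=0, good=105, is_good=lambda rev: rev >= 102)
print(revision, tests)
```

Each test halves the range still under suspicion, which is why the number of tests grows so slowly with the number of changesets. The exact count in a real run can differ from this sketch, since Mercurial picks its midpoints from the actual changeset graph.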

In my case, a source tree of 105 changesets meant I only had to perform 6 tests to determine where the change occurred. Running grep against each of the 105 changesets would have completed in no time, but had I actually needed to build an OS image for each test, 105 builds would have taken a very long time indeed.
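As a rough back-of-the-envelope check on those numbers, the following Python sketch compares testing every changeset with bisecting. The one-hour build time is purely hypothetical, and the ceiling of log2 is only an upper bound for a straight line of changesets, so it lands in the same ballpark as Mercurial's own estimate rather than matching it exactly.

```python
import math


def worst_case_tests(changesets: int) -> int:
    """Upper bound on bisection tests for a linear run of changesets."""
    return math.ceil(math.log2(changesets))


BUILD_MINUTES = 60  # hypothetical cost of one build-and-install test

for label, tests in (("test every changeset", 105), ("bisect", worst_case_tests(105))):
    hours = tests * BUILD_MINUTES / 60
    print(f"{label}: {tests} tests, about {hours:.0f} hours")
```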

Here are some edited highlights:

timf@haiiro[435] hg bisect -g tip
timf@haiiro[436] hg bisect -b 0
Testing changeset 52:42e67ad1e103 (105 changesets remaining, ~6 tests)
125 files updated, 0 files merged, 5 files removed, 0 files unresolved
timf@haiiro[438] grep "Add 1%" build_dist.lib
timf@haiiro[439] hg bisect -b
Testing changeset 78:76e8ef490770 (53 changesets remaining, ~5 tests)
119 files updated, 0 files merged, 95 files removed, 0 files unresolved
timf@haiiro[447] hg bisect -b
Testing changeset 103:b8d33c12a531 (4 changesets remaining, ~2 tests)
2 files updated, 0 files merged, 0 files removed, 0 files unresolved
timf@haiiro[448] grep "Add 1%" build_dist.lib
# Calculate the size of the pkg data directory.  Add 1% of the
timf@haiiro[449] hg bisect -g
Testing changeset 102:ef08a25b1d1c (2 changesets remaining, ~1 tests)
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
timf@haiiro[450] grep "Add 1%" build_dist.lib
# Calculate the size of the pkg data directory.  Add 1% of the
timf@haiiro[451] hg bisect -g
The first good revision is:
changeset:   102:ef08a25b1d1c
user:        Karen Tung 
date:        Wed Aug 06 20:22:36 2008 -0700
summary:     2810 pkg archive size not big enough sometimes
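The grep-then-mark loop above can also be handed over to Mercurial entirely, in versions that support hg bisect --command: that option runs a command at each step and reads its exit status, with 0 meaning good, 125 meaning skip, and other non-zero values meaning bad. Below is a small Python sketch of such a test, mirroring the grep. The function name is hypothetical, as is the choice to treat a missing file as "change not present".

```python
import os
import sys


def bisect_exit_status(path: str, marker: str) -> int:
    """Return 0 (good) if `marker` occurs in the file at `path`, else 1 (bad)."""
    try:
        with open(path, encoding="utf-8", errors="replace") as handle:
            found = any(marker in line for line in handle)
    except FileNotFoundError:
        found = False  # file absent in this revision: treat the change as missing
    return 0 if found else 1


# hg bisect --command sets HG_NODE for the command it runs, so only exit
# with a status when we appear to be running under it.
if __name__ == "__main__" and "HG_NODE" in os.environ:
    sys.exit(bisect_exit_status("build_dist.lib", "Add 1%"))
```

Saved as, say, check_fix.py (a made-up name), this could be used by marking the endpoints with hg bisect -g tip and hg bisect -b 0 as before, then running hg bisect --command "python check_fix.py" and letting Mercurial do the rest.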

So – I need to update our copy of distro_constructor to be based on changeset ef08a25b1d1c, which gets me the fix for 2810. Yahoo!