Archive for May 2008
bzr, git, and hg performance on the Linux tree
OK, so I just did a historical comparison of git and bzr performance using the Linux source tree. One of the comments I got was “what about Mercurial?” Fair enough. I’ve never really done much with Mercurial because Ubuntu primarily uses bzr, and git is what most of the other DVCS users I know use. However, there are a lot of projects using Mercurial, Mozilla probably being the most notable one. So, here’s a comparison of git, bzr, and hg. You may want to read my previous post for details on the steps I’m doing.
Repo Initialization:
git bzr hg
0m0.086s 0m0.334s 0m0.137s
1 : 3.88 : 1.59
Add 2.6.0 Linux tree:
git bzr hg
0m14.269s 0m4.852s 0m2.526s
5.65 : 1.92 : 1
Commit 2.6.0 Linux tree:
git bzr hg
0m10.263s 0m43.968s 0m30.890s
1 : 4.28 : 3.01
Diff after copying in 2.6.25.2 Linux tree:
git bzr hg
0m24.425s 0m51.158s 0m37.846s
1 : 2.09 : 1.55
Committing large changes:
git bzr hg
0m28.468s 1m8.627s 0m47.948s
1 : 2.41 : 1.68
Diff after no changes:
git bzr hg
0m0.343s 0m47.448s 0m1.340s
1 : 138 : 3.91
Getting repo status after no changes:
git bzr hg
0m1.230s 0m4.027s 0m1.077s
1.14 : 3.74 : 1
Committing a trivial change:
git bzr hg
0m0.397s 0m9.010s 0m1.913s
1 : 22.7 : 4.82
Repository size (just VCS control directory):
git (gc) bzr (pack) hg
92 MB 112 MB 179 MB
So, Mercurial performs quite well. It generally sits somewhere between git and bzr, although it was the fastest of the three at add and status. Hg runs somewhere around 2.75 times slower than git in the tested operations. Bzr runs around 5 times slower, with the notable exception that bzr diff when there are no changes is 138 times slower than git and 35 times slower than hg.
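For anyone who wants to check the ratio rows in the tables above, here is a minimal Python sketch of the arithmetic. It assumes the timings are strings in the `XmY.YYYs` form shown in the tables; the helper names `parse_time` and `ratios` are made up for this example.

```python
import re

def parse_time(text):
    """Convert a string such as '1m8.627s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised time: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

def ratios(times):
    """Scale each tool's time so that the fastest tool is 1."""
    seconds = {tool: parse_time(t) for tool, t in times.items()}
    fastest = min(seconds.values())
    return {tool: s / fastest for tool, s in seconds.items()}

# The "Diff after no changes" row from the tables above.
no_change_diff = {"git": "0m0.343s", "bzr": "0m47.448s", "hg": "0m1.340s"}
for tool, ratio in ratios(no_change_diff).items():
    print(f"{tool}: {ratio:.3g}")
```

Each row is normalised to whichever tool was fastest for that operation, which is why git is not always the 1 in the ratio rows.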
git/bzr historical performance comparison
OK, I know git vs. bzr has been beaten to death and that bzr speed often seems to be cited as its “Achilles’ heel”, but I was in #bzr the other day and somebody (a git fan, I take it) said something to the effect of “well, bzr couldn’t be used to work with the linux kernel tree, that’s what git was made for”. Now, I have no experience of working on the Linux tree, but it got me thinking about whether anybody had done any benchmarking of that kind of operation.
After some googling I found an old blog post from 2006 by Jo Vermeulen where he did some basic timing of common tasks such as adding files, doing diffs, commits, and finding repo status on the Linux 2.6 kernel tree using both git and bzr. Since both git and bzr have come a long way since 2006, I thought I’d replicate Jo’s comparison (which used git 0.99.9c and bzr 0.7pre) using current (by Ubuntu 8.04 standards anyway) versions of git (1.5.4.3) and bzr (1.3.1). So, here are the results:
First we unpack a Linux 2.6.0 tarball into linux-bzr and linux.git directories, then initialize the repos:
Initialization:
git (old) bzr (old) git (new) bzr (new)
0m0.161s 0m1.593s 0m0.086s 0m0.334s
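The post does not show the exact commands behind these timings. For readers who want to try something similar, here is a rough Python sketch of a wall-clock timer around an arbitrary command. The function name `time_command` is hypothetical, and the `git init` / `bzr init` commands mentioned in the comment are my assumption about what "initialize the repos" means here.

```python
import subprocess
import sys
import time

def time_command(argv, cwd=None):
    """Run a command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, cwd=cwd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start

if __name__ == "__main__":
    # Placeholder command so the sketch runs anywhere. Substitute, for example,
    # ["git", "init"] with cwd="linux.git" or ["bzr", "init"] with cwd="linux-bzr".
    elapsed = time_command([sys.executable, "-c", "pass"])
    print(f"{elapsed:.3f}s")
```

This measures only elapsed time for a single run, so disk caching and other activity on the machine will affect the numbers.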
Nothing exciting so far. Now we tell the VCSs to track the files via bzr/git add:
Adding files:
git (old) bzr (old) git (new) bzr (new)
0m42.121s 0m31.870s 0m14.269s 0m4.852s
In this case bzr not only wins in terms of absolute speed, but also in proportional gains over time. The git:bzr ratio in 2006 was 1.32:1 and now it’s 2.94:1. Jo didn’t mention in his comparison how long it took him to then commit the initial 2.6.0 tree we added, but for me it was 0m10.263s for git and 0m43.968s for bzr, a pretty clear win for git.
Next we’ll untar the latest 2.6.x kernel into our repos. Jo used linux-2.6.15.4 and I used linux-2.6.25.2. Perhaps I should have used the same version he did, but considering we’re using entirely different hardware I don’t think our results are directly comparable anyway. OK, so now we want to see how long it takes to diff the changes:
Diffing changes:
git (old) bzr (old) git (new) bzr (new)
2m26.982s 1m13.869s 0m24.425s 0m51.158s
This is one of the more fascinating results in my little experiment. The 2006 results gave a git:bzr ratio of 1.99, whereas my new results give a ratio of 0.48. Apparently git has done a lot of work on speeding up diffing.
Next we commit our new 2.6.x changes:
Committing large changes:
git (old) bzr (old) git (new) bzr (new)
0m54.964s 2m4.757s 0m28.468s 1m8.627s
So an old ratio of 0.44 and a new ratio of 0.41: not a lot going on there.
A really interesting test that Jo did was to do a bzr/git diff right after committing. Ideally this would take no time at all as we haven’t done anything since the commit, however:
Diffing no changes:
git (old) bzr (old) git (new) bzr (new)
0m0.057s 3m51.918s 0m0.343s 0m47.448s
Back when Jo did his experiment the git:bzr ratio was 0.00025! Ouch. My results gave a ratio of 0.0072. In this case bzr has been gaining a lot of ground but it’s still rather remarkable how long it takes to diff when there are no changes.
The other thing we would often do is a bzr/git status to see what’s going on:
Getting repo status:
git (old) bzr (old) git (new) bzr (new)
0m0.442s 0m19.711s 0m1.230s 0m4.027s
The original git:bzr ratio was 0.022 and the new one is 0.305, so bzr has gained by an order of magnitude but still lags a bit.
Lastly, we look at what happens if you make a minor change (let’s just add our name to MAINTAINERS for fun) and then commit:
Small commit:
git (old) bzr (old) git (new) bzr (new)
0m7.364s 2m6.685s 0m0.397s 0m9.010s
The times I got for both git and bzr are significantly faster than what Jo got in 2006. His git:bzr ratio was 0.058 and mine is 0.044, so if anything git has gained marginally on bzr here.
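The git:bzr ratios quoted throughout can be recomputed from the tables. The sketch below does that in Python, with the timings copied from the tables above; the names `secs`, `timings`, and `git_bzr_ratios` are just for illustration. A ratio below 1 means git was faster. A ratio that rises from 2006 to now means bzr closed the gap, and one that falls means git pulled further ahead.

```python
def secs(minutes, seconds):
    """Convert a minutes/seconds pair into seconds."""
    return minutes * 60 + seconds

# Each row: (git old, bzr old, git new, bzr new), in seconds.
timings = {
    "add": (secs(0, 42.121), secs(0, 31.870), secs(0, 14.269), secs(0, 4.852)),
    "diff (changes)": (secs(2, 26.982), secs(1, 13.869), secs(0, 24.425), secs(0, 51.158)),
    "large commit": (secs(0, 54.964), secs(2, 4.757), secs(0, 28.468), secs(1, 8.627)),
    "diff (no changes)": (secs(0, 0.057), secs(3, 51.918), secs(0, 0.343), secs(0, 47.448)),
    "status": (secs(0, 0.442), secs(0, 19.711), secs(0, 1.230), secs(0, 4.027)),
    "small commit": (secs(0, 7.364), secs(2, 6.685), secs(0, 0.397), secs(0, 9.010)),
}

def git_bzr_ratios(row):
    """Return the (old, new) git:bzr time ratios for one table row."""
    git_old, bzr_old, git_new, bzr_new = row
    return git_old / bzr_old, git_new / bzr_new

for name, row in timings.items():
    old, new = git_bzr_ratios(row)
    trend = "bzr gained" if new > old else "git gained"
    print(f"{name}: {old:.2g} -> {new:.2g} ({trend})")
```

Since the two sets of timings come from different hardware and different kernel versions, only the ratios are worth comparing, not the absolute times.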
A last interesting note of comparison is the storage size that the VCS takes up. After all the operations above my .bzr directory is 112MB (or 23% of the total size of the repo+working tree) and the .git directory is 162MB (or 30% of the total size) so it seems that bzr has a bit better storage compression.
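The post does not say how these sizes were measured. One way to get comparable figures is sketched below in Python; `dir_size` and `control_fraction` are hypothetical helper names. Note that it sums file sizes rather than disk blocks, so it will not exactly match a tool like `du`.

```python
import os

def dir_size(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total

def control_fraction(tree, control_dir):
    """Fraction of the whole tree (working files plus VCS data) used by control_dir."""
    return dir_size(os.path.join(tree, control_dir)) / dir_size(tree)
```

For example, `control_fraction("linux-bzr", ".bzr")` would give the share of the bzr tree taken up by its control directory.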
OK, so now the question is, what does it all mean? Well, I’m not entirely sure to be honest. When it comes to my original question of “Would bzr be usable working on the Linux tree” I would think, at least when it comes to common local operations, that the answer would definitely be yes. It’s not the fastest thing around but it’ll get the job done.
I use both git and bzr on a regular basis and both are exciting and have their own strengths and weaknesses. Git is no doubt very fast, though I think other DVCSs are starting to catch up. Bzr is very user friendly and has great plugins. It’s really a cool time for code sharing, in my opinion. Rock on!
