Say hello to Mercurial, my long-overdue replacement for CVS.
Unlike CVS and Subversion, Mercurial is a distributed version
control system (VCS), which means (among other things) it doesn't
have a central repository, has disconnected (non-networked)
commits, and allows you to group small changes together as "change sets".
Other well-known distributed VCSs include BitKeeper, Git, Darcs, and
Monotone (there are more). While searching for a CVS
replacement, I spent some time using Subversion, Monotone, and Git;
here's a brief overview of my experience with each one.
Subversion: Subversion is probably the most popular VCS, so
you're probably already familiar with it. I'll dispense with the
pleasantries and skip straight to the problems.
For the past several months I've been using a private, home Subversion
repository for small projects, snippets of code, configuration
files, scripts, and various other knick-knacks. Along the way, I
noticed several things about Subversion that bother me. For example,
until version 1.4, common operations like svn status and svn
commit were uncomfortably slow under Subversion. They're better now,
but still not as fast as I'd like. Copying and moving large groups of
files is still painfully slow (moving several hundred megabytes of
files took me well over 20 minutes).
Branching in Subversion is primitive (and slow, since it's really just
a copy). For me this is a major problem, because in addition to
revisions, I also want to use branches for quick, version controlled
staging areas for new features. That's a problem in Subversion,
because branches are expensive, and merging is kind of wimpy.
It's a genuine hassle to require network access for commits; I
regularly work remotely, and even though I have VPN access (courtesy
of OpenVPN) it's still kind of distracting to wait for common
commands like commit, add, and copy. My alternatives? Move the
repository to a public server with better bandwidth (which makes it
slower for me to access while I'm at home, plus it's not really
private any more and I'm still dependent on a network connection) or
hold off on commits until I'm at home (which is contrary to committing
in small, incremental changes, my preferred modus operandi).
Finally, and most importantly, Subversion is centralized. Why does that matter? It
imposes all sorts of workflow restrictions that haven't been
necessary since VHS tapes went out of style. For example, I have
roughly five gadzillion projects in various states of brokenness and
disarray that I'm just not ready to publish. Distributed version
control systems have no central repository except one that is
designated by convention, so I can commit locally, push to my private
repository when it's convenient, and publish to the public repository
when I'm damn good and ready. Subversion can't do any of this without
cheating, of course, so I'm forced to either migrate projects to the
public repository without their history or use svndumpfilter
chicanery to bludgeon Subversion into doing something it should be
able to do out of the box. Which sounds an awful lot like trying to
copy and move files in CVS. Which is why we were supposed to upgrade
to Subversion in the first place. Oops...
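As a rough sketch of the commit-locally, push-privately, publish-later workflow described above, a per-repository .hg/hgrc could name both destinations. The host names, paths, and the "public" alias below are made-up placeholders, so treat this as an illustration rather than a recommended layout.

```ini
# .hg/hgrc inside a working repository (hypothetical hosts and paths)
[paths]
# "hg push" with no argument goes to the private repository
default = ssh://hg@private.example.com/projects/myproject
# "hg push public" publishes to the public repository when ready
public = ssh://hg@public.example.com/myproject
```

With something like this in place, hg commit records changes locally with no network access, hg push sends them to the private repository, and hg push public publishes them with their history intact.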
(Subversion isn't all bad, by the way. It's certainly a huge
improvement over CVS. It integrates well with Rails,
Eclipse, and all the other fancy toys kids use these days.
Plus Subversion has all sorts of nifty extensions like TortoiseSVN
and Trac. I use Subversion daily at work. I just need things
Subversion doesn't support by design).
I'd be derelict in my blog posting responsibilities if I didn't
mention SVK, a distributed VCS built on Subversion that supports
repository mirroring and disconnected operation. I can't say much
about SVK because I don't have much experience with it, although I'm
fairly sure SVK has neither the speed nor the power of Mercurial and
Git. Personally, I don't really see the point of keeping Subversion
around for the sake of keeping Subversion around, particularly in light
of Subversion's marvelously atrocious track record with repository
corruption.
Monotone: I wanted to like Monotone. I stumbled across a
reference to it in the SQLite documentation, and spent several
months putting up with Monotone's warts after Linus plugged it on the
LKML. I like the extensive documentation, simple
command-line interface, Lua hooks, and proper Windows support.
Internally, Monotone makes extensive use of strong cryptographic
primitives, which I wholeheartedly support.
Unfortunately, Monotone is slow. Dog slow. An initial repository pull
(checkout, in CVS parlance) is so slow that many Monotone
users provide a publicly downloadable snapshot of the initial pull
instead. The last time I used Monotone, the crypto certs were their
own special blend; I'd prefer either OpenPGP or X.509.
(Oddly enough, my first look at Mercurial was right after I started
testing Monotone. I wasn't initially interested in Mercurial because
I was still stuck on Monotone. I didn't feel like Mercurial offered
much more than Monotone, and I hadn't fully appreciated the speed
difference between the two).
Git: Git was (is) right at the top of the list. It's fast,
possibly (probably?) even faster than Mercurial. It has features
Mercurial doesn't support (rebase, for example, although I believe that can
be clumsily emulated with bundle and unbundle). Keith Packard
wrote a post titled "Repository Formats Matter",
advocating Git for X.org. His post briefly mentions Mercurial, and in
a positive light, but dismisses it prematurely for what I think is a
completely asinine reason: old, obscure ftruncate() bugs in the
Linux kernel (see this post on the Mercurial
mailing list for a more thorough rebuttal of Keith Packard's
ftruncate() silliness).
I only have two real gripes with Git: the Windows support sucks (it
half-works via Cygwin, which doesn't really count), and the
command-line interface makes me feel stupid.
The second one is the deal-breaker for me. While I may not be
Mensa material, I've spent enough time using version control that I
feel like I should be able to get at least the gist of a new VCS in a
couple of minutes, be comfortable with it within a day or so, and be
proficient with it within about a week.
I don't really think that's unreasonable. Even if it is, so what? A
VCS is a tool, one that's supposed to make my life easier. If I
can't use it without consulting the documentation every couple of
minutes then it's just getting in my way.
I simply refuse to waste my time learning the nuances of an interface
that is complex for no other reason than that the programmer couldn't
see past their own idiosyncratic whims long enough to provide
an interface without the learning curve of a black diamond ski slope.
This is particularly true for an application like Git that has few,
if any, tangible benefits when compared to its more intuitive
counterparts.
The silver lining here is that I eventually stumbled on Mercurial. And
by stumbled, I mean Richard (richlowe) told me about it
(just like he told me about Vim, Screen, Mutt, Ruby, and a whole
lot of other cool stuff I use regularly). He knows a lot more about
version control software than I do, but I didn't really pay any
attention. At least not until I noticed that Mercurial seemed to be the
only free VCS that wasn't enclosed in a long and colorful string of
profanity when he talked about it.
Anyway, the more I use Mercurial, the more I like it. It meets all of
the requirements I mentioned above, plus it has the speed and power of
Git and the simplicity of Subversion and CVS. Mercurial is actively
developed, has full Windows support, and it includes extensions that add
support for PGP-signed tags and Quilt-style patch queues.
The real killer feature for me, though, is that everything I try just
works. Setting up a read-only, web-accessible public repository only took
a minute or two of reading, and making an entire directory of Mercurial
repositories available only took a couple more minutes. I had
comparable experiences with branching, tagging, signing tags, and
pushing changes to multiple repositories.
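For the directory-of-repositories case, the configuration can be as small as the sketch below. It assumes a Mercurial version whose hgweb understands the * wildcard in [paths], and /srv/hg is a placeholder directory rather than a real location.

```ini
# hgweb.config, read by the hgwebdir/hgweb script (placeholder paths)
[paths]
# Serve every repository found directly under /srv/hg at the web root
/ = /srv/hg/*
```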
The only warts I've found in Mercurial so far are minor: the web
interface needs a bit of cleanup, and there should be a straightforward
way of adding repository defaults like style, contact, and archive
formats via the top-level hgwebdir configuration file. The native
import features are still a bit lacking, although you can use Tailor
to convert data from all but the most esoteric or convoluted
repositories.
That's about all the advocacy I can muster up at the moment. If you're
interested in reading more about the state of distributed version
control systems, there are more detailed VCS comparisons here and
here.
Note: I've had this post sitting in my queue for months. I
just brushed off the cobwebs, cleaned up the typos, and posted it.
In that time Mercurial has picked up a bit of publicity, and development
has been moving along at a steady clip. I tried to remove the bits that
no longer apply, but let me know if I missed anything.
Edit: This article was linked on Reddit; some additional conversation (and
my responses) can be found in the comment
thread.