Piloting Git in a Subversion Environment

We’re looking to implement a pilot of Git to familiarize the team with the tool and introduce them to the more flexible workflow that Git supports over SVN. Because we support concurrent lines of development, we would like to forego the usual way of using git with SVN (each developer clones a branch of the repository and commits there) and try to implement a pure git experience using an intermediate “project repository”. The idea behind this repository is as follows:

  • We clone the main subversion repository using git svn
  • An empty project repository is created on the server and the Subversion clone is pushed to this repository (its an additional remote).
  • The development team clones the project repository and proceeds to do development in their own clones. Periodic pushes to and pulls from the project repository keeps each developers master branch up to date, from which they can rebase their local branches.
  • Periodically (probably daily), the originating SVN clone is ‘svn rebased’ and the new changes pushed to the project repository. These changes will be pulled by the developers on the next pull from the project repository.

In picture form, the workflow looks something like this:

Envisioned Pilot Process

Nevermind – StackOverflow ruined it for me.

Subversion and SSL Troubles

I decided to upgrade my home Subversion repository to version 1.4.3 as soon as it was released. Since then, my ViewVC application has ceased to work, getting a Python exception every time I try to execute it. Creating a small Python program that just imports the library (from svn import fs) gave me the following error:

ImportError: /usr/local/lib/libsvn_ra_dav-1.so.0: undefined symbol: SSL_load_error_strings

Thinking it was an SSL library problem, I upgraded SSL – a few times. I kept mucking with the options, rebuilding Subversion, only to get everything installed and get that same error:

ImportError: /usr/local/lib/libsvn_ra_dav-1.so.0: undefined symbol: SSL_load_error_strings

Over, and over and over again I repeated the process and got the same result. The absolute definition of insanity. This has been going on for a couple of months and I’ve been trying to address it in my spare time, as I’ve been pretty busy lately during the week and gone to the Relaxation Unit the last few weekends.

I googled my ass off to find the error, but to no avail. Finally today I ran across this thread that explained the problem. After going through my distribution directory for 1.4.4 (which I upgraded at the beginning of the month only to receive the same error) I realized that I hadn’t pulled down the Subversion dependencies tarball and rebuilt neon. So, basically I was using an old version of the neon libraries.

I finally settled on the configure statement listed here, after downloading and untarring the deps file:

./configure --with-ssl --with-apxs=/usr/local/apache2/bin/apxs 
            --with-apr=/usr/local/apache2 --with-apr-util=/usr/local/apache2 
            --enable-shared --with-libs=/usr/local/ssl

This uses the already installed apr libraries that I build with my Apache server, and ensures that the neon shared libraries are built. A quick configure/make/make install/make swig-py/make install-swig-py sequence later and my Python libraries were working fine.

I made it a point this time to document this on the Labs internal wiki, but thought I should throw this out here in public so that others can find it. Hope it helps save the weeks of frustration that I have been suffering for someone out there.

Happy building …

Video: How Open Source Projects Survive Poisonous People (And You Can Too)

Since getting a 80G iPod about a month ago two weeks ago, I’ve been really getting into watching the Google Tech Talks on Google Video. I recently watched How Open Source Projects Survive Poisonous People (And You Can Too), a lecture given by Brian Fitzpatrick and Ben Collins-Sussman from the Subversion team (now both Google employees) that summarizes a lot of information in Karl Fogels book Producing Open Source Software: How to Run a Successful Free Software Project.

If you haven’t had time to pick up and read Karls book, this video would be a good primer to some of the concepts in it and could very well motivate you to pick it up. Its an excellent book and one that I thoroughly enjoyed reading.

Practical Subversion – Second Edition

I received a free copy of Practical Subversion, Second Edition by Daniel Berlin and Garrett Rooney on Friday from their publishers, Apress.

I had reviewed the first edition before it was released and had found it to be an excellent companion to “Version Control with Subversion” (C. Michael Pilato, Ben Collins-Sussman, Brian W. Fitzpatrick), mostly due to its coverage of the Subversion API’s – which I had not seen covered in any real depth in any other book.

I have to say, the authors have outdone themselves with the Second Edition. The book is extremely well written for varying levels of Subversion experience. The beginner will find a very easy to understand introduction to using Subversion in the first two chapters, giving a really great tutorial on how to use the tool along with explanations of many of the concepts embodied in the implementation of the tool, such as locking vs. non-locking systems, properties (from file to revision properties), the basic workflow involved in using version control, and how to use the various commands, from checking out, to using svn blame (or ‘praise’ as I learned from the book is an alias for the command) to find the origin of a change in the system.

Thats just the first two chapters. As the book goes on the reader will learn about repository administration, the differences between the BDB and FSFS file systems, using Apache and Apache modules to squeeze additional functionality into the system, migrating from other version control systems such as CVS and Perforce and third party tools that work with Subversion (such as ViewVC, emacs, etc). The book also covers maintaining vendor branches, and contains a very good chapter on Version Control Best Practices. You then have, from my memory anyway, a greatly expanded chapter on using the Subversion API.

Practical Subversion, Second Edition does a really good job of covering information at many skill levels in an extremely accessible way. This book will definitely be one of those that I would put on the shelf at work as we continue to move people into more advanced roles in the management of our repositories – as there’s really nothing the book doesn’t cover.

I’ve been a user of Subversion for a very long time (I think I started around version 0.19 or so) and as I perused the book last night I walked away with some new distinctions about how we were using the tool and changes I could make to make administration and maintenance easier. That says a lot.

Congratulations to Garrett and Daniel on a fine piece of work. Hopefully the next edition will cover some of the newer features of 1.4, specifically the svnsync tool.

Building Subversion on The Mac and using Ecto for Blogging

I finally upgraded my Subversion installation on my MacIntosh to the 1.4 version. I was waiting for the “official” packages to come out so that I could just install it, but in looking at the different places recommended by the downloads page, these distributions haven’t been updated since early 1.3 releases.

I’ve had a goal to keep my Mac somewhat pristine. I decided thats not really practical. There are a lot of things that I use that I just like having built from scratch, so that I’m on the most current software and not dependent on someone else’s schedule. Subversion is one of those tools.

One thing I was shocked at was how quickly and seamlessly the build happened on the Mac. These MacBooks are pretty fast machines. I think it took a total of roughly 20 minutes (if that) to build, run tests, and install. The build on the Mac is definitely the fastest configure/check/install cycle I’ve gone through in the many installations of Subversion that I have performed over the years.

I tell you, the more I’m on the Mac, the more I like it. I haven’t run into anything that I’ve found irritating.

Its all good.

This is also the first post I am writing using a trial version of Ecto. I have to say, this application is pretty impressive too. They have both a Mac and a Windows version. I like it much better than blogging in WordPress directly – and at $17.95 a copy, its practically a no brainer to purchase.

Don’t get me wrong, I’m going to milk the 21 day trial, but it feels like this application is a pretty good fit for me.

Upgrading Subversion to 1.4

Subversion Logo

This weekend I finally bit the bullet and upgraded our production Subversion server to 1.4. The upgrade was painless, after the usual running of autogen.sh on the Neon libraries that for some reason are necessary when building on the Solaris 9 environment. I also had some weird test failures that wound up being caused by hook scripts being called on the test repositories. For some reason, Python couldn’t find one of the standard libraries, even though I had set my path and library path in my environment. A quick Google search later and all tests passed.

We have been running Subversion with the Berkeley 2.2 back-end since May of 2004, when the Subversion team first released 1.0. We’ve done a few upgrades since then, when I found the time, but wound up getting as far as 1.2.3 and my time to actually do the upgrades never really materialized. Part of that had to do with some pretty large projects that were going on that I didn’t want to interrupt the teams with an upgrade.

So yesterday, finally, I did the upgrade to 1.4. As part of the upgrade, I decided to dump and load the repositories (11 of them) to take advantage of the new SVN Diff format that 1.4 brings to the table. As part of that, I also decided to abandon the Berkeley back-end in favor of the FSFS file system.

The dumps and loads took quite a long time, about sixteen hours. However, the results were worth it. After all was said and done, we saw a reduction in server disk space usage of about 46.8%.

Not bad guys!

As I had mentioned in a previous post, one of the motivators for this upgrade were the working copy improvements, which according to the release notes, should cut down on inode usage on machines running the client. In our environment, we have about 20 people all working on the same machine (in their home directories) and we have had problems with inode max-outs in the past. Again, I’ll have to reserve judgement on these changes until we see the results, but this is one of the things that motivated me to bite the bullet and take a Saturday to do the upgrades.

The other big motivation for me, especially after yesterday was this statement from the FSFS document referenced above:

An FSFS repository can be backed up with standard backup software.
Since old revision files don’t change, incremental backups with
standard backup software are efficient. (See “Note: Backups” for
caveats.)

(BDB repositories can be backed up using “svnadmin hotcopy” and can be
backed up incrementally using “svnadmin dump”. FSFS just makes it
easier.)

Aside from normal system backups done nightly by the admin team, our nightly backups have also consisted of doing nightly dumps of all of the repositories so that, in the event that the repository server goes down, we can just load the repository on another machine (you can never have too many backups – in my opinion). The wake up call from yesterday is that it takes much longer to reload a repository on Solaris 9 than it does on the Linux work-group server I have sitting here that I did all of my original testing on a few years ago. What I thought would be a trivial load (3 hours was the estimate that I had in my head) wound up being 5-6 times that on the 2×2 Solaris box at work. It will be nice to be able to rely on the nightly system backups for these now — and perhaps just a quick tarring of the existing repository structure for quick backups (using hotcopy), rather than the dump / load process we have planned to use up until now. I’ve just never been comfortable for some reason with the Berkeley backend and backups outside of repository dumps.

Finally, I’m really looking forward to playing around with svnsync to set up a shadow fail-over repository in the event of an outage. This was something I threw on a wish list back in 2004, and I’m excited to see a tool to synchronize a “stand by repository” included in this release. There apparently is not much documentation on this new feature yet, but this article should be a good starting point.

Thanks to the Subversion team for another great release of their software. As anyone who reads this blog with any regularity knows, this is one of those pieces of software that I continuously receive value from and cannot say enough about.

Now I’m off to search for more svnsync documentation …

Subversion 1.4 Released

The Subversion team has released version 1.4 of its popular version control software. You can check out the release notes over at the official site get the the details, but here’s a summary of the changes, pulled directly from the aforementioned release notes:

  • svnsync, a new repository mirroring tool
  • Huge working-copy performance improvements
  • Support for BerkeleyDB 4.4 and its “auto recovery” feature
  • Size improvements to the binary delta algorithm
  • A handful of new command switches
  • Many improved APIs
  • More than 40 new bugfixes

I was going to post about this yesterday, but I wanted to make sure I had the software built and installed here at the Labs before throwing the link out there. The upgrade went relatively well. Since I still use Berkeley DB for some of the repositories here, I build my own software to minimize the amount of dumping / loading I have to do. Currently I am still running Berkely DB 4.2. Referencing this during the build allowed me to avoid some problems people have reported using pre-packaged distributions that upgrade Berkeley DB for you, rendering your repository useless. Building my own also allows me to get the software without waiting for the binary distributions to become available.

One note on the upgrade. I’ve been a little lax in upgrading my Apache server (also custom built) and was running version 2.0.48 or so. The new release requires an up to date version of the apr libraries, so this also forced me to upgrade Apache to 2.0.59. Overall, the upgrade was painless.

As mentioned above, this release also includes the svnsync tool, which is a repository mirroring tool. From what I’ve read so far, the destination mirror must remain read only – there is no synchronization between two duplicate repositories (at least from the limited reading material I’ve found around so far), so this release by no means invalidates the SVK tool. Nevertheless, the working copy improvements and the mirroring capability shows that the team is still on the right track.

Also noted in the release notes:

… the new working copy format allows the client to more quickly search a working copy, detect file modifications, manage property metadata, and deal with large files. The overall disk footprint is smaller as well, with fewer inodes being used. Additionally, a number of long standing bugs related to merging and copying have been fixed.

I’m going to reserve judgement on these improvements until I get the Solaris boxes at work upgraded. The working copies are really an Achilles heel on Solaris environments, where 20 or so developers use one machine to do all development. We’ve had a number of inode-maxouts over the last year or so. When I get these machines upgraded, I’ll post a follow up on the performance on Solaris.

One other enhancement I’m glad to see, the diff and merge commands now support a -c option which you can use to merge one revision between branches. This allows you to avoid using a revision range for a simple one revision merge. This should simplify things a bit …

Subversion is still, overall, the best version control tool I’ve worked with thus far (and I’ve worked with quite a few of them). Kudos to the team on the new release. I like what I see so far …