mercurial-server on CentOS or RedHat 5.x

So, as per my previous blog post, I got mercurial-server set up on one of our Ubuntu servers just fine.

Then I tried on one of our cPanel based CentOS servers.

What a pain in the ass!

There has been an effort to get this working, somewhat, by adding a new target to the Makefile: `setup-useradd`.

This uses the RedHat/CentOS `useradd` command in place of Ubuntu’s `adduser`.

Unfortunately, because of the way the Makefile is set up, if you don’t have all the prerequisites for building the docs, the product only gets half-setup.

Rather than install all the prerequisites, I found it simpler to just whack the Makefile around and remove the documentation build from the install.

Find the line that says:

`installfiles: installetc installdoc pythoninstall`

and replace it with:

`installfiles: installetc pythoninstall`

You’ll also have to remove the `--system` parameter from the `useradd` call on line 62 of the Makefile, since this `useradd` doesn’t support it and it isn’t necessary.
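If you’d rather script those two edits than make them by hand, something like the following should do it. It’s a minimal sketch, not part of mercurial-server: the function name `patch_makefile` is made up, and it assumes the `installfiles` line reads exactly as quoted above and that the flag appears as ` --system` on the same line as `useradd`.

```shell
# Hypothetical helper: apply both Makefile edits described above.
# $1 = path to the Makefile (default: ./Makefile).
patch_makefile() {
    makefile="${1:-Makefile}"
    cp "$makefile" "$makefile.orig"        # keep a copy to diff against
    # 1. Drop installdoc from the installfiles target.
    # 2. Drop --system from any line that mentions useradd (assumed layout).
    sed -e 's/^installfiles: installetc installdoc pythoninstall$/installfiles: installetc pythoninstall/' \
        -e '/useradd/s/ --system//' \
        "$makefile.orig" > "$makefile"
}
```

Run `patch_makefile` from the unpacked source directory, then `diff Makefile.orig Makefile` to confirm that only those two lines changed.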

Do those things, then use:

	# sudo make setup-useradd

and everything else will go just fine. If you need to build the docs, you’re on your own; I have no interest in getting CentOS all set up for this unnecessary step.

Serving Mercurial over SSH

So…

I’ve been deploying our big Django application to all of our servers using a pretty slick setup with Fabric and rsync.

This worked fine when I was the only developer, working on my local machine and pushing to a Mercurial repository on one of our internal Ubuntu boxes. Since it was all local, I just used the server’s Apache setup and mod_wsgi and didn’t worry about security too much. The Linux box is completely firewalled off from the Intertubes.

However, as we get more people working on the code, and as we deploy to more servers, having the ability to update to and from our Mercurial repository is becoming more important. Just ask Jeff(rey), whose template changes I clobbered this afternoon.

Since the shared hosting on which I wanted to set up my Mercurial repositories doesn’t have mod_wsgi, nor is it really safe to put a ‘foreign’ module like that into a cPanel setup, I had to find another way to serve Mercurial securely.

Also, since we’re finding that Apache and its threading model are consuming waaaaaaaayyyy too much memory under load, we’re moving toward lighter weight, single purpose servers for everything anyway.

So…I found mercurial-server.

It gives secure, tightly controlled Mercurial access over a simple SSH connection.

After I found the `.deb` file and was able to use `dpkg` to actually install it, things went pretty smoothly.

Setup instructions are pretty straightforward; the only part that confused me a little was extracting the key from SSH Agent with `ssh-add -L`. I wasn’t using SSH Agent, so the directions didn’t work, but once I figured out that all I needed in hand was the public key, I was on my way.
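For anyone tripped up by the same step, here is a rough sketch of getting hold of the public key with or without an agent. The function name `print_public_key` and the default path `~/.ssh/id_rsa.pub` are assumptions for illustration; use whichever key file you actually generated.

```shell
# Hypothetical helper: print a public key.
# Uses the agent's keys if `ssh-add -L` succeeds, otherwise falls back
# to reading a key file ($1, default ~/.ssh/id_rsa.pub, an assumption).
print_public_key() {
    keyfile="${1:-$HOME/.ssh/id_rsa.pub}"
    if keys=$(ssh-add -L 2>/dev/null); then
        printf '%s\n' "$keys"
    else
        cat "$keyfile"
    fi
}
```

The output is the text that the mercurial-server instructions expect you to have in hand.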

The repositories are kept squirreled away in kind of an odd location that’s not mentioned anywhere (`/var/lib/mercurial-server/repos/`), but as long as that directory tree is getting backed up, I’m fine with wherever it wants to put them. The reason for that location, far as I can figure, is that the /var/ tree is supposed to be for exactly this sort of thing:

/var/ Variable files—files whose content is expected to continually change during normal operation of the system—such as logs, spool files, and temporary e-mail files. Sometimes a separate partition.

At any rate, it took a couple of hours to get set up right and, while speeds don’t seem to be quite as good as they were running under Apache, it’ll work just fine for my automated setup.

Now, to figure out how to get hooks to do all the necessary pushing and pulling for me…

To see how to get this running on CentOS, see my next blog post. What a PITA!

Create New BitBucket Private Repository

So, I’ve finally bitten the proverbial bullet (or was that bucket?) and am now an official, paying BitBucket subscriber. Yay me, Yay BitBucket.

So…naturally, the first thing I want to do is put a couple of my locally hosted (and, therefore not well off-site-backed-up) repositories up.

Here’s how to move local repositories to BitBucket, step-by-step

Before we get started:

MAKE A BACKUP OF EVERYTHING.

Some of the commands we’ll use will destroy version control files, entire directories, and could wipe out all of your work.

If you have a backup, you can just restore from backup, mark me as “idiot” in your contact manager, and that will be that.

In any case, by reading further, you are agreeing that you have a backup, that you know how to restore from it if something goes wrong for any reason, and that you agree not to blame me or try to make me responsible for anything that goes wrong.

So, here’s how you do it (mostly for my future reference):

We’re going to use the repository name `stuff` and the directory that `stuff` currently resides in will be `~/stuff` i.e. `stuff` in your home directory.

Create the repo on BitBucket by going to http://bitbucket.org/account, clicking Repositories on the top menu bar, and picking “Create New” from the menu.

Fill in the simple form and press “Create Repository.”

You will be brought to the home page for that repository.

Change to the directory above the one containing the stuff you want to put into the repository.

Remember, we’re assuming your `stuff` project lives in `~/stuff`, so that means your home directory (NOTE: a plain `cd` will do it on *nix).

Once there, move the current set of files aside:

# cd ~
# mv stuff stuff.sav

Now, I prefer to use ssh for transferring anything anywhere, ever, so I’ve got my system set up for public key access to BitBucket. If you use the http: method of accessing your repositories, you’ll have to swap in the http equivalents of the commands I use below.

Copy the clone command from the BitBucket web site. In this case it was:

 hg clone ssh://hg@bitbucket.org/ssteiner/stuff/

You should get something like:

(~/)# hg clone ssh://hg@bitbucket.org/ssteiner/stuff/
destination directory: stuff
no changes found
updating working directory
0 files updated, 0 files merged, 0 files removed, 0 files unresolved

You now have `~/stuff.sav` containing your original code and `~/stuff` containing the checkout of nothing.

So, now, copy the actual stuff into the repository:

# cp -r stuff.sav/* stuff/

And change to the repository directory:

# cd ~/stuff

Now, check Mercurial’s status to see that it doesn’t know anything about anything:

# hg st

You should see all your files listed, each with a `?` in front, since nothing’s tracked by Mercurial yet.

WARNING!

If you were previously tracking any of the components of the `stuff` project with another version control system, you may have a whole collection of files related to those systems.

In the project I was using for this article, I had remnants of Subversion and Git, adding many, many wasted bytes to the checkin.

Fortunately, I found it before I checked it into BitBucket.

I got rid of all that cruft with:

# cd ~/stuff
# find . -type d -name .svn -prune -print -exec rm -rf {} \;
# find . -type d -name .git -prune -print -exec rm -rf {} \;

WARNING!

The two commands above can remove a lot of stuff.

If you run them in the wrong directory, they can remove all of your version control files from every directory in your system.

Please be careful and understand what you’re doing, and see the disclaimer at the beginning of this page.
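One way to lower the risk is a dry run that only lists what would be removed. The sketch below is an illustration under that assumption; the function name `list_vcs_dirs` is made up, and it deletes nothing.

```shell
# Hypothetical helper: print every .svn and .git directory under $1
# (default: the current directory) without deleting anything.
list_vcs_dirs() {
    # -prune stops find from descending into the matched directories.
    find "${1:-.}" -type d \( -name .svn -o -name .git \) -prune -print
}
```

Run `list_vcs_dirs ~/stuff` first; if every path in the output sits inside the project and is something you are happy to lose, go ahead with the real commands.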

Put it all under Mercurial

Put everything under version control with:

# hg add *

Commit it:

# hg commit -m "First commit of stuff project"

Push it up to BitBucket:

# hg push

Global .hgignore

Now that I’ve pretty much switched to Mercurial (though I don’t have a great server setup yet), I wanted to simplify my .hgignore files.

What I wanted was something like a ~/.hgignore file in my home directory to exclude “the usual” so I could make the local .hgignore be specific to the current project.

I found several references that said to use:

[ui]
ignore = ~/.hgignore

But that didn’t work.

The solution I found, after much Googling, was to modify the project’s .hg/hgrc with:

[ui]
ignore = ~/.hgignore

I couldn’t find that in the Mercurial documentation anywhere, and the .hg/hgrc file doesn’t seem to exist by default, so I’m not sure how to get new repositories to include that chunk automatically.

Anyone with more info, please feel free to comment. I’d love to have my .hg/hgrc generated automatically on hg init with that and any other things that might be useful.
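Short of a built-in answer, a small shell wrapper can approximate it. This is only a sketch: `enable_global_ignore` and `hginit` are made-up names, not Mercurial commands, and it assumes that appending a second `[ui]` section to an existing hgrc is acceptable for your setup.

```shell
# Hypothetical helper: append the [ui] section shown above to a
# repository's .hg/hgrc, creating the file if it doesn't exist yet.
# $1 = repository root (default: current directory).
enable_global_ignore() {
    hgrc="${1:-.}/.hg/hgrc"
    # Leave the file alone if it already has an ignore setting.
    if [ -f "$hgrc" ] && grep -q '^ignore[[:space:]]*=' "$hgrc"; then
        return 0
    fi
    printf '[ui]\nignore = ~/.hgignore\n' >> "$hgrc"
}

# Hypothetical wrapper: create a repository, then wire up the ignore file.
hginit() {
    hg init "${1:-.}" && enable_global_ignore "${1:-.}"
}
```

With those two functions in your shell startup file, `hginit myproject` would stand in for `hg init myproject`, and `enable_global_ignore .` would retrofit an existing repository.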

Comments, please?

Thanks,

S

Bitbucket offline for hours, reminded me of backups…

BitBucket Was Down. For a Long Time. Relatively Speaking.

So, as many of you will already know, BitBucket, our favorite Mercurial hosting service, went down.

For a while.

Seemed like a long time.

This reminded me of two things, in particular:

  1. DVCS — why we only use DVCS now
  2. Backups — making sure it’s ALL backed up AUTOMATICALLY.

DVCS, ONLY

What if BitBucket had never come back up?

With Mercurial (or Git, or Bazaar, or, I think, darcs), you have a complete copy of the repository. Not just the latest, everything. So, if the external repository blows up, everyone working on the project has a copy of everything as of the last time they sync’d with another copy.

Interestingly, this is exactly how DropBox works; you (and everyone else) have a complete copy of all of the files in the DropBox, giving you fresh copies of everything, on every machine, as of the last time it was connected to the network.

Backups!

Ok, I have backups of everything, in lots of places, for almost everything.

But, I noticed, in what would have been an “Oh, crap, too late” type of way, that I didn’t (and don’t) have backups of everything on BitBucket.

IOW, my repository would have been as safe as my last pull or update, but I would have lost the issue tracker and wiki.
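The “last pull” half of that can at least be kept fresh automatically. The sketch below covers only the source repositories, not the issue tracker or wiki; the function name `mirror_repos` and the idea of a plain text file with one repository URL per line are assumptions for illustration.

```shell
# Hypothetical helper: mirror every repository URL listed (one per line)
# in file $1 into directory $2. Clones on the first run, pulls afterwards.
mirror_repos() {
    list="$1"
    dest="$2"
    mkdir -p "$dest"
    while IFS= read -r url; do
        [ -n "$url" ] || continue              # skip blank lines
        name=$(basename "$url")                # e.g. .../ssteiner/stuff/ -> stuff
        # </dev/null keeps hg (and ssh underneath it) from eating the list.
        if [ -d "$dest/$name/.hg" ]; then
            hg -R "$dest/$name" pull < /dev/null
        else
            hg clone --noupdate "$url" "$dest/$name" < /dev/null
        fi
    done < "$list"
}
```

Called from cron as something like `mirror_repos repos.txt ~/backups/hg`, it would keep a full-history copy of each listed repository without anyone having to remember to pull.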

I don’t have a solution right this second, but I’d like to collect community comments about this so we can develop and post (on BitBucket) a solution to “How do I make my DVCS hosting on BitBucket cover all of the things I have up on BitBucket, not just the source code repository?”

Thanks for any comments. I’d love a solution so that, if BitBucket ever were to fail completely, we’re all sure we’ve got one or more copies of everything. And it has to be completely automatic.

S

Snow Leopard vs. virtualenv – easy_install virtualenv==dev != latest

So there I was, merrily plooking along with my various Python projects and had occasion to make a new virtual environment using `mkvirtualenv` from Doug Hellmann’s excellent virtualenvwrapper.

And it hung.

I ctrl-C’d out after a few minutes and tried again. Hung.

Figured it might be a Snow Leopard thing, so I did a quick:

	# easy_install virtualenv==dev 

Figuring that’d get me the latest version.

Same thing.

Poked around in the source looking for a clue for a minute, then did the obvious; Googled for the error message.

Which led me to this post.

Turns out that easy_install grabs from a subversion repository that’s not quite up to date with the new code up on bitbucket.

To quote that post:

Turns out that triggers an install from the Subversion repository at colorstudy.com which *doesn’t* have the Snow Leopard fix, but is also labeled as version 1.3.4dev. So I guess I was chasing my tail a bit.
I should have done this:

> easy_install http://bitbucket.org/ianb/virtualenv/get/tip.zip

That gets the virtualenv with fix I was after, and indeed does work.

So, the lesson is: in this time of projects moving off of their own little subversion repositories and onto bitbucket and github, and easy_install, out of the box, supporting only subversion and CVS (which I won’t dignify with a link), check your assumptions about which version of what you’ve got installed; sometimes things LIE!

Hopefully, this will save someone else some time and trouble.

P.S.
Speaking of subversion, I’m hoping to get the setuptools Mercurial plugin working sometime soon since most of my new projects are on Mercurial, but I’ll probably wait until I convert over to Distribute, which may get it built in sooner rather than later.

Mercurial Tags Are Handled Oddly

Mercurial tags are a little odd.

The canonical reference.

The odd thing is that the tag is not included in a checkout of the tag.

I don’t seem to remember any other system behaving this way.

Usually, when you check out a tagged revision, the tag is included in the checkout.

Apparently, in Mercurial (from the wiki):

Common wisdom says that to avoid the confusion of a disappearing tag, you should clone the entire repo and then update the working directory to the tag. Thus preserving the tag in the repo.

Ok, then, that doesn’t make any flippin’ sense at all.
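For what it’s worth, the behaviour seems to follow from how Mercurial stores tags: they live in a versioned `.hgtags` file, and the changeset that records a tag necessarily comes after the changeset being tagged. A clone limited to the tagged revision therefore leaves out the commit that records the tag. The wiki’s advice can be sketched as below; the function name `clone_at_tag` is made up for illustration.

```shell
# Hypothetical helper: clone the whole repository, then check out a tag.
# $1 = source URL or path, $2 = destination directory, $3 = tag name.
clone_at_tag() {
    # --noupdate skips the initial checkout; the update then goes
    # straight to the tagged revision.
    hg clone --noupdate "$1" "$2" && hg -R "$2" update -r "$3"
}
```

Because the full history is cloned, the tag stays visible in the new repository even though the working directory sits at the tagged revision.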

That’s all for now!

Converting from Git to Mercurial

So, as per my previous post, we’re going to be using Mercurial going forward.

I have several projects in Git already and was doing a conversion to Mercurial.

I did the simple:

	# rm -rf .git 
	# mv .gitignore .hgignore
	# hg init
	# hg status

Unfortunately, this gave me:

	abort: /super-secret/.hgignore: invalid pattern (relre): *~

WTF?

Turns out that, for Mercurial, you have to tell it the format of your .hgignore file.

Inserting this at the top of the file:

syntax: glob

fixed it right up, and I’m off and running.
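The rename-and-fix step can be rolled into one small function. This is a sketch with a made-up name, `gitignore_to_hgignore`; it only prepends the syntax line and leaves the patterns untouched. Git and Mercurial globs are not identical (Mercurial has no `!` negation, for one), so the result still deserves a read-through.

```shell
# Hypothetical helper: build .hgignore from .gitignore in directory $1
# (default: the current directory). The original .gitignore is kept.
gitignore_to_hgignore() {
    dir="${1:-.}"
    # The syntax line has to come before the patterns it applies to.
    { echo 'syntax: glob'; cat "$dir/.gitignore"; } > "$dir/.hgignore"
}
```

After running it, `hg status` should get past the `invalid pattern` abort shown above.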

Into the breach!

The WSSW Stack

Choosing The Stack

Ok, so I’ve been plooking around with various web frameworks, even languages, for a couple of years now.

Now, while starting WebSauce Software for real, it’s time to choose a standard toolset. This is what we are going to use to produce our software until further notice.

Unless there’s a compelling reason to change, this is what we’re using.

If something great comes along to replace a component then fine, but it’ll have to be pretty damn good for us to switch.

If it’s great, we’ll switch.

Adapt or die!

First a little history.

We got into the web business about 7 years ago after 25 years of general purpose contract programming which overlapped with about 10 years of software publishing.

I started consulting in about 1982, started publishing software in about 1986, stopped publishing software in 1994, and retired from the software business, sort of, in 1995, had my first son in 2002, and went back to work in 2004-ish.

I did some consulting between 1995 and 2004, but only a handful of really complex, challenging jobs. I was not making a living, I was just taking on work I liked and wanted to do.

When I went back to work, I didn’t know exactly what type of work would be coming up and I wasn’t too worried about it. I’ve always managed to keep busy.

Unfortunately, I had been out of the loop for almost 10 years so most of my old consulting contract clients were gone, companies changed hands, engineers at those companies moved around to parts unknown. In short, I didn’t really have any contacts any more.

So, I rented an office and hung my shingle out to see what would happen.

People kept asking me if we did websites.

So, I said we did.

Now, it’s not that we hadn’t done websites before that for ourselves or for customers, but we weren’t in the business of making websites for other people.

So, now we were, and we did.

Lots of them.

We grew, hired people, had clients, had a stream of new clients, a few big clients, I wrote some nice tools for in-house use that made us more efficient than other companies, we learned the web development business and everything was hunky-dunky.

Except…

I hate making new websites for people who don’t already have them.

They have unrealistic expectations of what the site can do for them and especially, how much it should cost. At least people with existing sites have an idea what things cost, and know what the site is doing or not doing for them.

Improving an existing site is way better, for us. Less friction, better results all’round.

What I do like…

Fixing existing sites

Fixing up an existing site is a blast. We get to leverage all of our cool tools and, because of those tools, we’re very efficient at it. Because of our efficiency, clients get a better deal than they did from their prior company, which makes us look good and, since almost everything is automated, we make good profit margins.

Best part? I get paid to spend time ploinking on the tools we use to do customer jobs more efficiently which is the most fun for me.

Doing SEO

Getting sites to rank well in the Search Engines, making sure that their customers can do useful things with their website, and generally helping our customers serve their customers better.

My software engineering background has allowed me to write some tools that do things in this area that nobody else has. We’ll be publishing some of them soon. We’ll let you know ;-).

Writing web applications

Things that are kind of like desktop applications but run in a browser and do things that are appropriately web based. We’ve done SalesForce.com integration, custom database editing applications for real estate brokers, inventory control and management against existing, legacy databases that just need a new view to be more useful than they already are, all kinds of stuff. Love it.

How I’ve Written All This Stuff

I’ve written utilities for doing the repetitive parts of SEO and also written web applications for various purposes for clients and for in-house needs.

I was always hunting for the best development toolset both for client applications and for our own internal tools.

I’ve gone through a lot of tools.

So I tried…in no particular order

and God knows how many other frameworks, version control systems, WSGI components, templating languages, and chunks and parts of various solutions.

So…I’ve finally settled

So, after all that trial and error, here’s my toolset.

This is what I’m using from now on unless there’s a compelling reason to use something else. Most of the bigger tools (Django, for example) have or are developing plug-in parts for things like the templating system so these choices are not as rigid as having this list might imply.

Linux Distribution: Ubuntu

I’ve used just about every Linux distribution at one time or another. We host lots of sites on the CentOS series, and I think one of our in-house boxes is SUSE. Then I started using Ubuntu since it seemed to be the one most of the documentation for the tools I was using was written for. I figured there must be some reason for that since it was just too pervasive to be a coincidence. Not a coincidence. It just works better. All of our cloud servers are now fired up with the Ubuntu 9.04 server configuration and I run Kubuntu (I absolutely hate Gnome, love KDE). I’m envious of the MacOS-X Aqua theme, only for Gnome so far, but it’s not enough of a reason to switch to Gnome.

Ubuntu has been rock solid, and apt-get blows away any other system package tool I’ve used (yum, nasty RPMs, etc.).

Language: Python

The language I always come back to. I’ve tried other languages. Seems like I’ve tried every other language at one time or another. Last time I counted it was, like, 40 or something including dialects of Basic, Pascal, C, C++, Delphi, various Assembly languages, Perl, Ruby, Awk, SmallTalk, Lisp, Sed, Haskell, and many, many others I can’t even remember. I don’t remember who said it, but Python really is executable pseudo code.

VCS: Mercurial (hg)

Up until a few months ago, we were Subversion users. I feel dirty even saying it, now. We used Perforce for one job but I hated it the whole time. I always found Subversion annoying; especially trying to merge branches.

The centralized repository always gave me an uncomfortable feeling I never identified until I started using Git on an Open Source project I was working on.

The first time I did a merge, I was hooked. It was painless and it wasn’t a trivial merge either. I had to manually resolve one conflict out of 30 or so changes. It took five minutes. It would have taken all day in Subversion and I would have been swearing the whole time. I was leaning toward Git, not having used any of the other likely suspects much until this announcement.

Then, there’s Google’s support which double sealed the deal.

Since the main Python repository is going to be Mercurial, and since that will likely drive adoption on other projects that have yet to move out of Subversion, and since Mercurial is written in Python, it would be silly to use anything else since there’s really little obviously superior about any other DVCS.

Mercurial is also sure to get lots of loving attention and will pass Git in short order in any area where it’s currently lagging. Fortunately, Git, Mercurial, and Bazaar are similar enough that it’ll be easy enough to switch around when needed.

WebSauce’s projects will all be DVCS’d in Mercurial and I’ll document the setup as soon as I get around to it. The setup, that is…

Desktop App Development: Cocoa/Objective-C

I tried writing my first OS X Application for publication using Python and PyObjC. I had a working prototype but, even with expert help, couldn’t get it to run anywhere but my development system. Next app is pure Objective-C and, if I need Python for something, I’ll run it as an external process and work on getting the results back some way other than running inside the main application space.

Web Framework: Django

I may not like some of the parts of the Django stack so much but it all hangs together well and, if I get sufficiently dissatisfied with any particular part, I’m sure there will be a way to “fix” it on my own checkout and submit a patch. I’m pretty sure most of the Django pieces are pluggable to some extent and, where they’re not, it would be good of me to help make them so. That’s what Open Source is all about, right?

Web ToolKit: Twisted

Twisted does so many things, and our applications need so many of them, that it’d be silly not to use the grandfather of all things Python and Web.

Sure, it’s a little hard to wrap your head around in the beginning, and there are parts that are dark, deep, and mysterious, but I’ve been hanging around on the mailing list and IRC channel and I’m confident that if I run into a problem, and do my research before asking for help, that I’ll be able to get any problem solved in relatively short order.

Because so much of the rest of our apps require Twisted services, we’re going to run our Django app using Twisted’s WSGI unless we run into problems. Then we’ll fall back to either CherryPy’s WSGI, Apache’s mod_python, or Apache’s mod_wsgi. Whatever, not a big deal.

Documentation Language: Restructured Text

The documentation format of Python that can be easily converted to everything else.

It’s human-readable in source form, intuitive, and is everywhere in all the tools I use.

No brainer.

Other Tools

The stack really isn’t worth anything unless you can deploy it.

For that, I’m relying on several other Python based tools:

Paste

I’m only using the directory template creation of Paste. Paste is, for the most part, overgrown and under-focused, but the directory templating works well enough for now.

virtualenv, virtualenvwrapper

These allow me to set up an isolated Python environment in which to run my applications. Keeps all the cruft out of the system and gives me an attainable target to deploy.

zc.buildout

Allows creation of a completely self-contained app. Virtualenv’s great for development, but this wraps it all up in a one-stop-shopping bundle.

fabric

Makes deployment as simple as writing a Python script that does what you want to distribute an application to wherever you want to deploy it.

Sphinx

The documentation tool used on the Python project itself. You can set up a documentation structure in one command, write your docs in reStructuredText, and have it in html, latex, and several other formats in a flash.

github/BitBucket/LaunchPad

Not really part of the deployment stack, but having worked on several open source projects on GitHub with Git, I think it’s about the best there is right now. I’m still interested in looking at BitBucket and I’m contributing to a few projects there as well, but GitHub seems to be more mature and has a much more informative and useful interface. LaunchPad is very ambitious, and seems well thought out and pretty all-encompassing. Unfortunately, the only backend it supports is Bazaar. Yuck.

Basecamp

We’ve been using Basecamp for a while now for project management. It’s not perfect but it is the best shared system we’ve found. We’ve tried Google Docs and got addicted to shared documents but the rest of the system doesn’t provide any project management functionality so things tended to get lost in there since there was no way to indicate what was to be done next. Basecamp also has shared documents (Writeboards) and also ToDo Lists and Milestones which make it possible to keep a project moving.

FogBugz

We’re currently using FogBugz to track our bugs in the OS X product that we’re untangling the Python code from and it really is a great bug tracking system.

We’ve been focused mostly in Basecamp so it will be interesting to see how well they integrate or whether we move to another system for this functionality. An obvious choice would be Trac and, with the buildout script, maybe it won’t be so abominable to install.

For now, that’s it.

I’ll be updating this as I update the toolset but this is it, for now…