S3FS on CentOS

So, we’re using CentOS 5 for some of our servers; the ones we need cPanel for.  These are our shared setups with people running blogs, Joomla, Drupal, and such.

I’ve never liked FTP for anything due to its insecurity, slowness, and its inability to recover from even the simplest of errors.

So I finally got our server provider to build a kernel with FUSE support built in so that I could use s3fs to mount an Amazon S3 bucket as a normal mount.

It was a little annoying to set up the first time but, when I had to do it a second time and had to go all the way back to the beginning, I figured I’d better write it down this time.

Install Subversion

First step is to get s3fs from its site at: http://code.google.com/p/s3fs/wiki/FuseOverAmazon.

I prefer to check out from Subversion but Subversion was not installed on my server.

A simple:

	# yum install subversion

 

gave me an error about a missing dependency:

Error: Missing Dependency: perl(URI) >= 1.17 is needed by package subversion-1.4.2-4.el5_3.1.x86_64 (base)

 

To make a long story short, I ended up downloading and installing the RPM directly, then running the Subversion install again:

# wget http://mirror.centos.org/centos/5/os/i386/CentOS/perl-URI-1.35-3.noarch.rpm
# rpm -ivh perl-URI-1.35-3.noarch.rpm
# yum install subversion

 

Download and Install s3fs

Once I had subversion installed, I checked out and built s3fs:

# svn checkout http://s3fs.googlecode.com/svn/trunk/ s3fs-read-only
# cd s3fs-read-only/s3fs
# make install

 

There are a handful of warnings from the compiler, but I ignored them since I wasn’t particularly interested in working on the code.

Setting up Keys

You can invoke s3fs with your Amazon credentials on the command line, in the environment, or in a configuration file. Since command lines and environments are too easy for bad guys to find, I opted for the configuration file approach.

Create a file /etc/passwd-s3fs with a line containing an accessKeyId:secretAccessKey pair.

You can have more than one set of credentials (i.e., credentials for more than one Amazon S3 account) in /etc/passwd-s3fs, in which case you’ll have to specify -o accessKeyId=aaa on the command line.
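If you’d rather script the creation of that credentials file, here’s a minimal Python sketch. The function name `write_s3fs_credentials` is made up for illustration, and restricting the file to owner-only permissions is my own precaution (an assumption, not something the s3fs wiki page above spells out). It refuses to overwrite an existing file.

```python
import os


def write_s3fs_credentials(path, access_key_id, secret_access_key):
    """Write one accessKeyId:secretAccessKey line, readable only by the owner."""
    line = f"{access_key_id}:{secret_access_key}\n"
    # O_EXCL refuses to clobber an existing file; 0o600 = owner read/write only.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(line)
    return path
```

Run as root, something like `write_s3fs_credentials("/etc/passwd-s3fs", "aaa", "bbb")` would produce the single-account file described above.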

Once that’s all set up, you can mount the S3 bucket mybucket at the mountpoint /mnt/mybucket with:

	# /usr/bin/s3fs mybucket /mnt/mybucket

Now, you can treat /mnt/mybucket as a regular copy destination including using it for rsync!
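One thing worth guarding against in an automated copy job: if the bucket isn’t actually mounted, /mnt/mybucket is just an empty local directory and your files land on the server’s own disk. Here’s a small sketch of a pre-flight check. The helper name `require_mount` is hypothetical, and I’m assuming a POSIX system where `os.path.ismount` recognizes the FUSE mount.

```python
import os


def require_mount(path):
    """Return path if it is a mount point; raise RuntimeError otherwise."""
    if not os.path.ismount(path):
        raise RuntimeError(f"{path} is not a mount point; is the bucket mounted?")
    return path
```

Call `require_mount("/mnt/mybucket")` before kicking off rsync and bail out if it raises.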

If you ever want to get rid of the mount, the normal Unix umount command does the trick:

	# umount /mnt/mybucket

Enjoy!

Serving Mercurial over SSH

So…

I’ve been deploying our big Django application to all of our servers using a pretty slick setup with Fabric and rsync.

This worked fine when I was the only developer, working on my local machine and pushing to a Mercurial repository on one of our internal Ubuntu boxes. Since it was all local, I just used the server’s Apache setup and mod_wsgi and didn’t worry about security too much. The Linux box is completely firewalled off from the Intertubes.

However, as we get more people working on the code, and as we deploy to more servers, having the ability to update to and from our Mercurial repository is becoming more important. Just ask Jeff(rey), whose template changes I clobbered this afternoon.

Since the shared hosting on which I wanted to set up my Mercurial repositories doesn’t have mod_wsgi, nor is it really safe to put a ‘foreign’ module like that into a cPanel setup, I had to find another way to serve Mercurial securely.

Also, since we’re finding that Apache and its threading model are consuming waaaaaaaayyyy too much memory under load, we’re moving toward lighter weight, single purpose servers for everything anyway.

So…I found mercurial-server.

It gives secure, tightly controlled Mercurial access over a simple SSH connection.

After I found the `.deb` file and was able to use `dpkg` to actually install it, things went pretty smoothly.

The setup instructions are pretty straightforward. The only part that confused me a little was extracting the key from SSH Agent with `ssh-add -L`. I wasn’t using SSH Agent, so the directions didn’t work, but once I figured out that all I needed in hand was the public key, I was on my way.

The repositories are kept squirreled away in kind of an odd location that’s not mentioned anywhere, but that doesn’t seem to be much of an issue: as long as that directory tree (`/var/lib/mercurial-server/repos/`) is getting backed up, I’m fine with wherever it wants to put it. The reason for that location, as far as I can figure, is that the /var/ tree is supposed to be for things that change while the system runs:

/var/ Variable files—files whose content is expected to continually change during normal operation of the system—such as logs, spool files, and temporary e-mail files. Sometimes a separate partition.

At any rate, it took a couple of hours to get set up right and, while speeds don’t seem to be quite as good as they were running under Apache, it’ll work just fine for my automated setup.

Now, to figure out how to get hooks to do all the necessary pushing and pulling for me…

To see how to get this running on CentOS, see my next blog post. What a PITA!

How to Work On Multiple Twisted Branches at the Same Time

So…

I’m very interested in the new HTTP/1.1 functionality that’s pending review in Twisted in two different branches.

They’re all about new Twisted Web Client functionality and HTTP/1.1, and are documented in TwistedWebClient.

Don’t download them, but they are:

expressive-http-client-886-4

and:

high-level-web-client-3987

The main Twisted version control repository is currently Subversion which isn’t particularly good at merging things or, at least, I’ve never been very happy with the way it works.

So, I asked on the #twisted IRC channel how one would go about working on those two branches simultaneously.

The following conversation ensued:

ssteinerX: Does anyone know how to checkout twisted-branch-expressive-http-client-886-4 and twisted-branch-high-level-web-client-3987 in such a way as they can be used together?
ssteinerX: 3987 depends on the stuff in 886-4 but they’re in completely separate branches
ssteinerX: I forget (thankfully) how to even use svn other than a simple checkout…
ivan: I use git and merge everything into my personal branch
ssteinerX: ivan: have you merged those two particular branches successfully?
ivan: yes
ssteinerX: Is it something you could possibly post (or have posted) to github?
ivan: my git svn mirror of unmodified Twisted code is at ludios.net
ivan: git svn fetch, make a branch, merge 886-4, merge 3987

So…taking that information, here’s how to make a local branch that contains everything from those two branches together:

First, get whatever is the current “Twisted-svn-git-to…” archive from ludios.net.

I use wget like:

	# wget http://ludios.net/mirror/Twisted-whatever-the-heck

Unarchive that, then change to the directory and update it with:

	# git svn fetch

That will pull the latest changes from the svn server right into your git repository.

Then, make a branch of your own to work on. I called mine 886-4+3987 since that’s what it is…

	# git branch 886-4+3987
	# git checkout 886-4+3987
	# git merge expressive-http-client-886-4
	# git merge high-level-web-client-3987

And there you have it. Everything in both branches, working in one checkout.

You can make a new virtualenv, install this branch to it with the normal `python setup.py develop` and go about your business with your new Twisted!

Create New BitBucket Private Repository

So, I’ve finally bitten the proverbial bullet (or was that bucket ?) and am now an official, paying BitBucket subscriber. Yay me, Yay BitBucket.

So…naturally, the first thing I want to do is put a couple of my locally hosted (and, therefore not well off-site-backed-up) repositories up.

Here’s how to move local repositories to BitBucket, step-by-step

Before we get started:

MAKE A BACKUP OF EVERYTHING.

Some of the commands we’ll use will destroy version control files, entire directories, and could wipe out all of your work.

If you have a backup, you can just restore from backup, mark me as “idiot” in your contact manager, and that will be that.

In any case, by reading further, you are agreeing that you have a backup, that you know how to restore from it if something goes wrong for any reason, and that you agree not to blame me or try to make me responsible for anything that goes wrong.

So, here’s how you do it (mostly for my future reference):

We’re going to use the repository name `stuff` and the directory that `stuff` currently resides in will be `~/stuff` i.e. `stuff` in your home directory.

Create the repo on BitBucket by going to http://bitbucket.org/account, click Repositories on the top menu bar and pick “Create New” from the menu.

Fill in the simple form and press “Create Repository.”

You will be brought to the home page for that repository.

Change to the directory above the directory containing the stuff you want to put into the repository.

Remember, we’re assuming your `stuff` project is contained in `~/stuff`, so change to your home directory (NOTE: a simple `cd` will do it on *nix).

Change to your home directory and move the current set of files aside:

# cd ~
# mv stuff stuff.sav

Now, I prefer to use ssh for transferring anything anywhere, ever, so I’ve got my system set up for public key access to BitBucket. If you use the http: method of accessing your repositories, you’ll have to substitute the http equivalents of the commands I use below.

Copy the clone command from the BitBucket web site. In this case it was:

 hg clone ssh://hg@bitbucket.org/ssteiner/stuff/

You should get something like:

(~/)# hg clone ssh://hg@bitbucket.org/ssteiner/stuff/
destination directory: stuff
no changes found
updating working directory
0 files updated, 0 files merged, 0 files removed, 0 files unresolved

You now have `~/stuff.sav` containing your original code and `~/stuff` containing the checkout of nothing.

So, now, copy the actual stuff into the repository:

# cp -r stuff.sav/* stuff/

And change to the repository directory:

# cd ~/stuff

Now, check Mercurial’s status to see that it doesn’t know anything about anything:

# hg st

You should see all your files listed since nothing’s tracked by Mercurial.

WARNING!

If you were previously tracking any of the components of the `stuff` project with another version control system, you may have a whole collection of files related to those systems.

In the project I was using for this article, I had remnants of Subversion and Git, adding many, many wasted bytes to the checkin.

Fortunately, I found it before I checked it into BitBucket.

I got rid of all that cruft with:

# cd ~/stuff
# find . -type d -name .svn -prune -exec rm -rf {} \; -print
# find . -type d -name .git -prune -exec rm -rf {} \; -print

WARNING!

The two commands above can remove a lot of stuff.

If you run them in the wrong directory, they can remove the version control files from every directory beneath it, which could be every directory in your system.

Please be careful and understand what you’re doing, and see the disclaimer at the beginning of this page.
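Since that cleanup is the scary part, here’s a sketch of the same idea in Python with a dry-run mode, so you can see what would be deleted before anything actually is. The function name `remove_vcs_dirs` and the `dry_run` flag are my own invention for illustration; it only looks for directories named .svn or .git, same as the find commands above.

```python
import os
import shutil

VCS_DIRS = {".svn", ".git"}


def remove_vcs_dirs(root, dry_run=True):
    """Find .svn/.git directories under root; delete them unless dry_run is set.

    Returns the list of matching paths either way.
    """
    found = []
    for current, dirnames, _filenames in os.walk(root):
        for name in list(dirnames):
            if name in VCS_DIRS:
                target = os.path.join(current, name)
                found.append(target)
                dirnames.remove(name)  # don't descend into a directory we're removing
                if not dry_run:
                    shutil.rmtree(target)
    return found
```

Run it once with the default `dry_run=True`, read the list it returns, and only then call it again with `dry_run=False`. The same disclaimer applies.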

Put it all under Mercurial

Put everything under version control with:

# hg add

Commit it:

# hg commit -m "First commit of stuff project"

Push it up to BitBucket:

# hg push

Global .hgignore

Now that I’ve pretty much switched to Mercurial (though I don’t have a great server setup yet), I wanted to simplify my .hgignore files.

What I wanted was something like a ~/.hgignore file in my home directory to exclude “the usual” so I could make the local .hgignore be specific to the current project.

I found several references that said to use:

[ui]
ignore = ~/.hgignore

But that didn’t work.

The solution I found, after much Googling, was to modify the project’s .hg/hgrc with:

[ui]
ignore = ~/.hgignore

I couldn’t find that in the Mercurial documentation anywhere. I’m also not sure how to get new repositories to automatically include that chunk in their .hg/hgrc, since the .hg/hgrc file doesn’t seem to exist by default.

Anyone with more info, please feel free to comment, I’d love to have my .hg/hgrc generated automatically on hg init with that and any other things that might be useful.
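In the meantime, a stopgap is a little script to run after `hg init`. This is only a sketch: the function name `set_global_ignore` is hypothetical, and I’m assuming the repository’s hgrc is plain enough for Python’s `configparser` to read. Note that rewriting the file this way drops any comments it had.

```python
import configparser
import os


def set_global_ignore(repo_path, ignore_file="~/.hgignore"):
    """Set `[ui] ignore = <ignore_file>` in <repo_path>/.hg/hgrc, creating the file if needed."""
    hgrc = os.path.join(repo_path, ".hg", "hgrc")
    config = configparser.ConfigParser(interpolation=None)
    config.optionxform = str  # keep option names exactly as written
    config.read(hgrc)  # a missing file is silently skipped
    if not config.has_section("ui"):
        config.add_section("ui")
    config.set("ui", "ignore", ignore_file)
    with open(hgrc, "w") as handle:
        config.write(handle)
    return hgrc
```

It expects the `.hg` directory to exist already, so run it from (or point it at) a repository that has been through `hg init`.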

Comments, please?

Thanks,

S

Snow Leopard vs. virtualenv – easy_install virtualenv==dev != latest

So there I was, merrily plooking along with my various Python projects and had occasion to make a new virtual environment using `mkvirtualenv` from Doug Hellmann’s excellent virtualenvwrapper.

And it hung.

I ctrl-C’d out after a few minutes and tried again. Hung.

Figured it might be a Snow Leopard thing, so I did a quick:

	# easy_install virtualenv==dev

Figuring that’d get me the latest version.

Same thing.

Poked around in the source looking for a clue for a minute, then did the obvious; Googled for the error message.

Which led me to this post.

Turns out that easy_install grabs from a subversion repository that’s not quite up to date with the new code up on bitbucket.

To quote that post:

Turns out that triggers an install from the Subversion repository at colorstudy.com which *doesn’t* have the Snow Leopard fix, but is also labeled as version 1.3.4dev. So I guess I was chasing my tail a bit.
I should have done this:

> easy_install http://bitbucket.org/ianb/virtualenv/get/tip.zip

That gets the virtualenv with fix I was after, and indeed does work.

So, the lesson is: in this time of projects moving off of their own little subversion repositories and onto bitbucket and github, and easy_install, out of the box, supporting only subversion and CVS (which I won’t dignify with a link), check your assumptions about which version of what you’ve got installed; sometimes things LIE!
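One cheap way to check that assumption is to ask Python what it actually has installed rather than trusting what you think you installed. A minimal sketch, assuming a Python new enough to ship `importlib.metadata` (3.8 or later); the helper name `installed_version` is mine:

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string for a distribution, or None if it isn't installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
```

Keep in mind the catch from the quoted post: two different code bases can carry the same version label, so a matching version string is a hint, not proof.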

Hopefully, this will save someone else some time and trouble.

P.S.
Speaking of subversion, I’m hoping to get the setuptools Mercurial plugin working sometime soon since most of my new projects are on Mercurial, but I’ll probably wait until I convert over to Distribute, which may get it built in sooner rather than later.

Mercurial Tags Are Handled Oddly

Mercurial tags are a little odd.

The canonical reference.

The odd thing is that the tag is not included in a checkout of the tag.

I don’t seem to remember any other system behaving this way.

Usually, when you check out a tagged revision, the tag is included in the checkout.

Apparently, in Mercurial (from the wiki):

Common wisdom says that to avoid the confusion of a disappearing tag, you should clone the entire repo and then update the working directory to the tag. Thus preserving the tag in the repo.

Ok, then, that doesn’t make any flippin’ sense at all.

That’s all for now!

Converting from Git to Mercurial

So, as per my previous post, we’re going to be using Mercurial going forward.

I have several projects in Git already and was doing a conversion to Mercurial.

I did the simple:

	# rm -rf .git
	# mv .gitignore .hgignore
	# hg init
	# hg status

Unfortunately, this gave me:

	abort: /super-secret/.hgignore: invalid pattern (relre): *~

WTF?

Turns out that, for Mercurial, you have to tell it the format of your .hgignore file.

Inserting this line at the top of .hgignore:

syntax: glob

Fixed it right up and I’m off and running.
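With several projects to convert, that step is easy to script. Here’s a minimal sketch; the function name `gitignore_to_hgignore` is hypothetical, and it only adds the header. It does not translate Git-only pattern features such as `!` negation, so check the result against `hg status`.

```python
def gitignore_to_hgignore(gitignore_text):
    """Prefix Git ignore patterns with the `syntax: glob` header Mercurial needs."""
    if gitignore_text.lstrip().startswith("syntax:"):
        return gitignore_text  # already declares a syntax, leave it alone
    return "syntax: glob\n\n" + gitignore_text
```

Read the old .gitignore, pass its contents through this, and write the result out as .hgignore.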

Into the breach!

Stupid Subversion on Stupid CentOS

After using Ubuntu for my day to day server for a while, I’m really realizing how much CentOS and yum suck.

root@tequila [~]# yum -y install subversion
Loading “fastestmirror” plugin
Loading mirror speeds from cached hostfile
* base: pubmirrors.reflected.net
* updates: mirrors.gigenet.com
* addons: yum.singlehop.com
* extras: mirror.steadfast.net
Excluding Packages in global exclude list
Finished
Setting up Install Process
Parsing package install arguments
Resolving Dependencies

[...]

--> Finished Dependency Resolution
Error: Missing Dependency: perl(URI) >= 1.17 is needed by package subversion

Stupid miserable POS!

Quick Solution:

wget http://rpm.evopanel.net/rpms/perl-URI-1.35-3.noarch.rpm

rpm -ihv perl-URI-1.35-3.noarch.rpm

yum -y install subversion

Done.

The WSSW Stack

Choosing The Stack

Ok, so I’ve been plooking around with various web frameworks, even languages, for a couple of years now.

Now, while starting WebSauce Software for real, it’s time to choose a standard toolset. This is what we are going to use to produce our software until further notice.

Unless there’s a compelling reason to change, this is what we’re using.

If something great comes along to replace a component then fine, but it’ll have to be pretty damn good for us to switch.

If it’s great, we’ll switch.

Adapt or die!

First a little history.

We got into the web business about 7 years ago after 25 years of general purpose contract programming which overlapped with about 10 years of software publishing.

I started consulting in about 1982, started publishing software in about 1986, stopped publishing software in 1994, and retired from the software business, sort of, in 1995, had my first son in 2002, and went back to work in 2004-ish.

I did some consulting between 1995 and 2004, but only a handful of really complex, challenging jobs. I was not making a living, I was just taking on work I liked and wanted to do.

When I went back to work, I didn’t know exactly what type of work would be coming up and I wasn’t too worried about it. I’ve always managed to keep busy.

Unfortunately, I had been out of the loop for almost 10 years so most of my old consulting contract clients were gone, companies changed hands, engineers at those companies moved around to parts unknown. In short, I didn’t really have any contacts any more.

So, I rented an office and hung my shingle out to see what would happen.

People kept asking me if we did websites.

So, I said we did.

Now, it’s not that we hadn’t done websites before that for ourselves or for customers, but we weren’t in the business of making websites for other people.

So, now we were, and we did.

Lots of them.

We grew, hired people, had clients, had a stream of new clients, a few big clients, I wrote some nice tools for in-house use that made us more efficient than other companies, we learned the web development business and everything was hunky-dunky.

Except…

I hate making new websites for people who don’t already have them.

They have unrealistic expectations of what the site can do for them and especially, how much it should cost. At least people with existing sites have an idea what things cost, and know what the site is doing or not doing for them.

Improving an existing site is way better, for us. Less friction, better results all’round.

What I do like…

Fixing existing sites

Fixing up an existing site is a blast. We get to leverage all of our cool tools and, because of those tools, we’re very efficient at it. Because of our efficiency, clients get a better deal than they did from their prior company, which makes us look good and, since almost everything is automated, we make good profit margins.

Best part? I get paid to spend time ploinking on the tools we use to do customer jobs more efficiently which is the most fun for me.

Doing SEO

Getting sites to rank well in the Search Engines, making sure that their customers can do useful things with their website, and generally helping our customers serve their customers better.

My software engineering background has allowed me to write some tools that do things in this area that nobody else has. We’ll be publishing some of them soon. We’ll let you know ;-) .

Writing web applications

Things that are kind of like desktop applications but run in a browser and do things that are appropriately web based. We’ve done SalesForce.com integration, custom database editing applications for real estate brokers, inventory control and management against existing, legacy databases that just need a new view to be more useful than they already are, all kinds of stuff. Love it.

How I’ve Written All This Stuff

I’ve written utilities for doing the repetitive parts of SEO and also written web applications for various purposes for clients and for in-house needs.

I was always hunting for the best development toolset both for client applications and for our own internal tools.

I’ve gone through a lot of tools.

So I tried…in no particular order

and God knows how many other frameworks, version control systems, WSGI components, templating languages, and chunks and parts of various solutions.

So…I’ve finally settled

So, after all that trial and error, here’s my toolset.

This is what I’m using from now on unless there’s a compelling reason to use something else. Most of the bigger tools (Django, for example) have or are developing plug-in parts for things like the templating system so these choices are not as rigid as having this list might imply.

Linux Distribution: Ubuntu

I’ve used just about every Linux distribution at one time or another. We host lots of sites on the CentOS series, and I think one of our in-house boxes is SUSE. Then I started using Ubuntu since it seemed to be the one most of the documentation for the tools I was using was written for. I figured there must be some reason for that since it was just too pervasive to be a coincidence. Not a coincidence. It just works better. All of our cloud servers are now fired up with the Ubuntu 9.04 server configuration and I run Kubuntu (I absolutely hate Gnome, love KDE). I’m envious of the Mac OS X Aqua theme, only for Gnome so far, but it’s not enough of a reason to switch to Gnome.

Ubuntu has been rock solid, and apt-get blows away any other system package tool I’ve used (yum, nasty RPMs, etc.).

Language: Python

The language I always come back to. I’ve tried other languages. Seems like I’ve tried every other language at one time or another. Last time I counted it was, like, 40 or something including dialects of Basic, Pascal, C, C++, Delphi, various Assembly languages, Perl, Ruby, Awk, SmallTalk, Lisp, Sed, Haskell, and many, many others I can’t even remember. I don’t remember who said it but Python really is executable pseudo code.

VCS: Mercurial (hg)

Up until a few months ago, we were Subversion users. I feel dirty even saying it, now. We used Perforce for one job but I hated it the whole time. I always found Subversion annoying; especially trying to merge branches.

The centralized repository always gave me an uncomfortable feeling I never identified until I started using Git on an Open Source project I was working on.

The first time I did a merge, I was hooked. It was painless and it wasn’t a trivial merge either. I had to manually resolve one conflict out of 30 or so changes. It took five minutes. It would have taken all day in Subversion and I would have been swearing the whole time. I was leaning toward Git, not having used any of the other likely suspects much until this announcement.

Then, there’s Google’s support which double sealed the deal.

Since the main Python repository is going to be Mercurial, and since that will likely drive adoption on other projects that have yet to move out of Subversion, and since Mercurial is written in Python, it would be silly to use anything else since there’s really little obviously superior about any other DVCS.

Mercurial is also sure to get lots of loving attention and will pass Git in short order in any area where it’s currently lagging. Fortunately, Git, Mercurial, and Bazaar are similar enough that it’ll be easy enough to switch around when needed.

WebSauce’s projects will all be DVCS’d in Mercurial and I’ll document the setup as soon as I get around to it. The setup, that is…

Desktop App Development: Cocoa/Objective-C

I tried writing my first OS X Application for publication using Python and PyObjC. I had a working prototype but, even with expert help, couldn’t get it to run anywhere but my development system. Next app is pure Objective-C and, if I need Python for something, I’ll run it as an external process and work on getting the results back some way other than running inside the main application space.

Web Framework: Django

I may not like some of the parts of the Django stack so much but it all hangs together well and, if I get sufficiently dissatisfied with any particular part, I’m sure there will be a way to “fix” it on my own checkout and submit a patch. I’m pretty sure most of the Django pieces are pluggable to some extent and, where they’re not, it would be good of me to help make them so. That’s what Open Source is all about, right?

Web ToolKit: Twisted

Twisted does so many things, and our applications need so many of them, that it’d be silly not to use the grandfather of all things Python and Web.

Sure, it’s a little hard to wrap your head around in the beginning, and there are parts that are dark, deep, and mysterious, but I’ve been hanging around on the mailing list and IRC channel and I’m confident that if I run into a problem, and do my research before asking for help, that I’ll be able to get any problem solved in relatively short order.

Because so much of the rest of our apps require Twisted services, we’re going to run our Django app using Twisted’s WSGI unless we run into problems. Then we’ll fall back to either CherryPy’s WSGI, Apache’s mod_python, or Apache’s mod_wsgi. Whatever, not a big deal.

Documentation Language: reStructuredText

The documentation format of Python that can be easily converted to everything else.

It’s human-readable in source form, intuitive, and is everywhere in all the tools I use.

No brainer.

Other Tools

The stack really isn’t worth anything unless you can deploy it.

For that, I’m relying on several other Python based tools:

Paste

I’m only using the directory template creation of Paste. Paste is, for the most part, overgrown and under-focused, but the directory templating works well enough for now.

virtualenv, virtualenvwrapper

These allow me to set up an isolated Python environment in which to run my applications. Keeps all the cruft out of the system and gives me an attainable target to deploy.

zc.buildout

Allows creation of a completely self-contained app. Virtualenv’s great for development, but this wraps it all up in a one-stop-shopping bundle.

fabric

Makes deployment as simple as writing a Python script that does what you want to distribute an application to wherever you want to deploy it.

Sphinx

The documentation tool used on the Python project itself. You can set up a documentation structure in one command, write your docs in reStructuredText, and have it in HTML, LaTeX, and several other formats in a flash.

github/BitBucket/LaunchPad

Not really part of the deployment stack but, from having worked on several open source projects on GitHub with git, I think it’s about the best there is right now. I’m still interested in looking at BitBucket and I’m contributing to a few projects there as well, but GitHub seems to be more mature and has a much more informative and useful interface. LaunchPad is very ambitious, and seems well thought out and pretty all-encompassing. Unfortunately, the only backend it supports is Bazaar. Yuck.

Basecamp

We’ve been using Basecamp for a while now for project management. It’s not perfect but it is the best shared system we’ve found. We’ve tried Google Docs and got addicted to shared documents but the rest of the system doesn’t provide any project management functionality so things tended to get lost in there since there was no way to indicate what was to be done next. Basecamp also has shared documents (Writeboards) and also ToDo Lists and Milestones which make it possible to keep a project moving.

FogBugz

We’re currently using FogBugz to track our bugs in the OS X product that we’re untangling the Python code from and it really is a great bug tracking system.

We’ve been focused mostly in Basecamp so it will be interesting to see how well they integrate or whether we move to another system for this functionality. An obvious choice would be Trac and, with the buildout script, maybe it won’t be so abominable to install.

For now, that’s it.

I’ll be updating this as I update the toolset but this is it, for now…