Mixpanel Engineering

Real-time scaling

Easy Python Deployment with Fabric and Git


If you're interested in what we work on, please apply - we're hiring: http://mixpanel.com/jobs/

When I want to deploy code to http://mixpanel.com, I open a new terminal window and type fab deploy. Even though we have quite a few servers these days, our deploy process is really streamlined – we push code multiple times per day.

When you’re first starting a new web project, deployment is easy. All you have to do is log in to your server and do a git pull, and probably restart your web server. No problem.

If you grow beyond a single machine, though, this technique is rife with problems: the time it takes to deploy code grows linearly with the number of servers you have, it’s difficult to synchronize deployment, and it’s simply error-prone. Any point in your deployment process that requires you to log in to a server and type multiple commands is just asking for trouble.

We use a tool called Fabric to automate this process. Fabric makes it really easy to run commands across sets of machines. It’s similar to Capistrano (Ruby), but it’s written in Python, so it was an easy choice for us.

How Fabric Works

Fabric is a command-line tool that you can install with easy_install.

sudo easy_install fabric

Once you get it installed, you can get started by creating a fabfile.py in the root of your application directory.

Here’s a bare-bones fabfile:

from fabric.api import env, roles, run

# Define sets of servers as roles
env.roledefs = {
    'web': ['123.123.123.123'],
    'cache': ['124.124.124.124', '125.125.125.125']
}

# Set the user to use for ssh
env.user = 'demo'

# Restrict the function to the 'web' role
@roles('web')
def get_version():
    run('cat /etc/issue')

The really cool part is the way you can define roles, and only run an action on the servers tied to each role. Here we are defining a ‘web’ role with a single IP, and when we call fab get_version it will only run on that server. We’re also defining a ‘cache’ role that is tied to two IPs, so a function acting on that role would run first on one IP and then on the other.

Any function you write in your fab script can be assigned to one or more roles. You can also include a single server in multiple roles – for example, we have an ‘all’ role that applies a function to every server.
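One way an ‘all’ role like that could be built is to derive it from the other roles instead of listing every IP twice. This is only a sketch: all_hosts is a made-up helper, and the dict simply mirrors the env.roledefs from the fabfile above.

```python
# Role definitions in the same shape as env.roledefs in the fabfile above.
roledefs = {
    'web': ['123.123.123.123'],
    'cache': ['124.124.124.124', '125.125.125.125'],
}


def all_hosts(roledefs):
    """Return every host that appears in any role, without duplicates."""
    seen = []
    # Sort the role names so the resulting host order is predictable.
    for role in sorted(roledefs):
        for host in roledefs[role]:
            if host not in seen:
                seen.append(host)
    return seen


# Hypothetical 'all' role, computed before it is added to the dict.
roledefs['all'] = all_hosts(roledefs)
```

In a real fabfile the same assignment would go on env.roledefs, so a server shared by two roles still shows up only once in ‘all’.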

Fabric functions

Fabric provides a set of commands in fabric.api that are simple but powerful:

local() – run a local command
run() – run a command on the remote server, user-level permissions
sudo() – sudo a command on the remote server
put() – upload a file to the remote server
get() – download a file from the remote server

You can combine these commands to do just about anything. We use Fabric for a whole bunch of stuff, including:

  • Deploying live
  • Deploying stage
  • Deploying modified config files
  • Restarting servers
  • Updating iptables rules
  • Stopping and starting processing

Basically, any time we would need to log in to multiple servers to do something, we turn to Fabric. It’s simple and powerful.

Deploying with Fabric

We still use Git to deploy, but Fabric makes it easy to automate. Here’s a watered-down version of our deploy function:

from fabric.api import roles, run, sudo

# Restrict the function to the 'web' role
@roles('web')
def deploy():
    path = '/path/to/your/code'
    run('cd %(path)s; git checkout master' % {'path' : path})
    run('cd %(path)s; git pull' % {'path' : path})

    sudo('cd %(path)s; ../fastcgi stop' % {'path' : path})
    sudo('cd %(path)s; ../fastcgi start' % {'path' : path})

This should be pretty readable, but basically we log in to every server with the ‘web’ role, pull from master, and restart fastcgi. It’s super fast and a lot less error-prone than manually logging in and doing all of that.
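The repeated cd prefix in that function can be factored into a small helper. Here is a sketch, assuming Python 3.3+ for shlex.quote; in_dir is a made-up name, not part of Fabric. Note that it joins with && instead of the ; used above, so the command is skipped if the cd fails.

```python
import shlex


def in_dir(path, command):
    """Build one shell line that runs `command` inside `path`.

    Uses '&&' (unlike the ';' in the deploy function above) so the
    command does not run if the cd fails, and quotes the path for the shell.
    """
    return 'cd %s && %s' % (shlex.quote(path), command)


path = '/path/to/your/code'
pull_command = in_dir(path, 'git pull')
```

Inside the fabfile, run(in_dir(path, 'git pull')) would then replace the hand-formatted string.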

Minor annoyances

As far as I can tell, Fabric can only execute commands sequentially rather than in parallel – but it looks like this is changing soon: http://code.fabfile.org/issues/show/19

Running simple commands on many servers is an embarrassingly parallel problem, so it’s good to see some progress is being made here.
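To see why this is embarrassingly parallel: each server’s command is independent of the others, so fanning out needs no coordination at all. Here’s a rough sketch using the standard library’s thread pool. This is not how Fabric does it; run_on_all is a made-up helper and fake_uptime is a stand-in for a real SSH call.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_all(hosts, run_on_host, max_workers=10):
    """Call run_on_host(host) for every host concurrently; return {host: result}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so they line up with hosts.
        results = list(pool.map(run_on_host, hosts))
    return dict(zip(hosts, results))


# Hypothetical stand-in for an SSH call; a real one would run a remote command.
def fake_uptime(host):
    return 'ok from %s' % host


statuses = run_on_all(['124.124.124.124', '125.125.125.125'], fake_uptime)
```

With a real remote call, the total time would be bounded by the slowest server rather than growing with the number of servers.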

It is also unfortunately difficult to aggregate functions that act on different roles. For example, using this fabfile:

@roles('web')
def foo():
    ...

@roles('api')
def bar():
    ...

def all():
    foo()
    bar()

fab all doesn’t actually work. This is aggravating because it’s a common task for us – we want to call a single deploy function and have it deploy certain things to the app server, others to the api server, etc.
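What we’d like fab all to do is roughly the expansion below: pair each sub-task with the hosts of its own role, in order. This is a plain-Python sketch of that intent, not Fabric’s API; expand_steps and the IP addresses are made up for illustration.

```python
def expand_steps(steps, roledefs):
    """Turn (task_name, role) pairs into an ordered list of (task_name, host)."""
    plan = []
    for task_name, role in steps:
        for host in roledefs[role]:
            plan.append((task_name, host))
    return plan


# Hypothetical hosts; the task and role names match the fabfile above.
roledefs = {'web': ['10.0.0.1'], 'api': ['10.0.0.2', '10.0.0.3']}
plan = expand_steps([('foo', 'web'), ('bar', 'api')], roledefs)
```

Here plan lists foo once for the web host and bar once for each api host, which is the dispatch a single aggregate deploy needs.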

These are definitely minor issues – Fabric is still incredibly valuable for us – I just thought it was important to point out that it’s not all rainbows and unicorns in Fabric-land.

If you’d like to learn more about Fabric, check out fabfile.org or the official IRC channel, #fabric on freenode.


Written by Tim Trefren

September 9th, 2010 at 8:00 am

Posted in Operations


8 Responses to 'Easy Python Deployment with Fabric and Git'


  1. Just an FYI, but newer versions of Fabric support this syntax:

    with cd('/srv/django/my_project/'):
        run('git checkout master')

    Which makes the whole cd’ing aspect a little easier to follow.

    Frank Wiles

    9 Sep 10 at 1:05 pm

  2. Thanks Frank – that’s interesting for sure.

    Tim Trefren

    14 Sep 10 at 9:27 am

  3. [...] In the real world, the network isn’t static. We’re adding new machines all the time, and if we don’t update iptables at the same time, the new machines won’t be able to communicate with the old ones. To solve this problem, I dynamically generate iptables files and deploy them with Fabric. [...]

  4. def all() can work if you set up your contexts.
    Example: http://dpaste.com/hold/259960/

    It works for now and I believe they will have a patch for this in the future.

    glenbot

    18 Oct 10 at 8:24 pm

  5. Hi Tim,

    Thanks for the quick writeup on fabric. I have a couple of questions:
    1. How do you deal with dependencies on external packages for eg. new python libraries that need to be installed?
    2. How do you do automated database migrations? You will still need to write migration scripts for changing db schemas for example..

    Arnav

    5 Dec 10 at 5:02 am

  6. Thank you for this article! Up to now we use SVN and unison to sync our dev/test/prod servers. I think fabric and git will be used by us in the future.

    Thomas Güttler

    2 Feb 11 at 1:18 am

  7. Fabric + Puppet (in standalone mode) = great deployment duo.

    Jens

    2 May 11 at 8:53 am

  8. Thought I’d update here that Fabric 1.3+ now has parallel execution. Let us know if you find it useful and/or have any improvements.

    Morgan Goose

    25 Oct 11 at 3:16 pm
