Mixpanel Engineering

Real-time scaling

Internship stories

with one comment

Last year, I wrote about my internship story because I felt it was such an impactful experience for me. It was simply a story of how working hard and being out in Silicon Valley can lead to very serendipitous occurrences. I don’t think I could have built Mixpanel without the knowledge and connections I gained at Slide. I learned so much about product and how to “get things done” at a real company, and I met really close friends whom I will keep for life. I was also fortunate enough to work closely with Max, who has been an invaluable mentor and investor for our business.

The point of that post, of course, was to find ourselves interns. We wanted to get a lot of work done, but we also genuinely wanted to give them an extremely meaningful experience like my own. We’d publicly promised them one, so we set out to make good on it. At the end of the summer I asked them to write about what it was like to intern at Mixpanel. I hope those of you who are considering interning at a startup vs. a big company will benefit.

Read the rest of this entry »

Written by Suhail

November 15th, 2011 at 12:31 pm

Posted in Uncategorized

Why We Moved Off The Cloud

with 55 comments

This post is a follow-up to We’re moving. Goodbye Rackspace.

Cloud computing is often positioned as a solution to scalability problems. In fact, it seems like almost every day I read a blog post about a company moving infrastructure to the cloud. At Mixpanel, we did the opposite. I’m writing this post to explain why and maybe even encourage some other startups to consider the alternative.

Read the rest of this entry »

Written by mixpanel

October 27th, 2011 at 12:34 pm

Posted in Uncategorized

How and Why We Switched from Erlang to Python

with 28 comments

A core component of Mixpanel is the server that sits at http://api.mixpanel.com. This server is the entry point for all data that comes into the system – it’s hit every time an event is sent from a browser, phone, or backend server. Since it handles traffic from all of our customers’ customers, it must manage thousands of requests per second, reliably. It implements an interface we’ve spec’d out here, and essentially decodes the requests, cleans them up, and then puts them on a queue for further processing.
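As a rough illustration of the decode, clean, and enqueue flow described above, here is a minimal sketch. It is not the actual server: the base64-encoded JSON payload, the `event` and `properties` field names, the `1`/`0` return values, and the in-process `queue.Queue` are all assumptions made for the example.

```python
import base64
import json
import queue

# Stand-in for the real processing queue (hypothetical; in-process only).
events = queue.Queue()


def handle_request(data_param):
    """Decode one request, clean it up, and enqueue it.

    Returns 1 if the event was accepted and 0 otherwise (assumed convention).
    """
    try:
        # Assumption: the payload is base64-encoded JSON.
        event = json.loads(base64.b64decode(data_param))
    except (ValueError, TypeError):
        # Covers bad base64, invalid JSON, and undecodable bytes.
        return 0

    # Reject anything that is not an object with a string event name.
    if not isinstance(event, dict) or not isinstance(event.get("event"), str):
        return 0

    props = event.get("properties")
    if not isinstance(props, dict):
        props = {}

    # "Clean up": keep only the fields we understand and normalize the name.
    cleaned = {"event": event["event"].strip(), "properties": props}
    events.put(cleaned)
    return 1
```

Keeping the handler this small means the request path only validates and hands off; anything expensive happens in whatever consumes the queue.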

Because of these performance requirements, we originally wrote the server in Erlang (with MochiWeb) two years ago. After two years of iteration, the code has become difficult to maintain. No one on our team is an Erlang expert, and we have had trouble debugging downtime and performance problems. So, we decided to rewrite it in Python, the de facto language at Mixpanel.

Given how crucial this service is to our product, you can imagine my surprise when I found out that this would be my first project as an intern on the backend team. I really enjoy working on scaling problems, and the cool thing about a startup like Mixpanel is that I got to dive into one immediately. Our backend architecture is modular, so as long as my service implemented the specification, I didn’t have to worry about ramping up on other Mixpanel infrastructure.

Read the rest of this entry »

Written by mixpanel

August 5th, 2011 at 5:37 pm

Posted in Uncategorized

My first week at Mixpanel, or how I didn’t take down the Internet

with one comment

During my first week at Mixpanel I was asked to design, implement and deploy a highly requested feature in our core JavaScript library. I had just started as the new intern and I hit the ground running. Our customers wanted a simple method to track link clicks without having to hassle with browser incompatibilities or fiddle with event models. The new functionality would also lay the groundwork for future enhancements such as form integration. I got to work right away.

Read the rest of this entry »

Written by Carl Sverre

May 23rd, 2011 at 10:54 am

Posted in Frontend


Sharding techniques

with 5 comments

At Mixpanel, we process billions of API transactions each month, and that number can sometimes increase rapidly in the course of a single day. It’s not uncommon for us to see 100 req/s spikes when new customers decide to integrate. Thinking of ways to distribute data intelligently is pivotal to our ability to remain real-time.

I am going to discuss several techniques that allow people to horizontally distribute data. We have interviewed people (by the way, we’re hiring engineers) who made poor partitioning decisions (e.g. partitioning by the first letter in a user’s name), and I think we can spread some knowledge around. Hopefully, you’ll learn something new.
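To make the first-letter example concrete, here is a small sketch comparing it with hashing the whole key. It illustrates the general idea only and is not necessarily one of the techniques covered in the full post; the shard count, function names, and sample names are all hypothetical.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical shard count


def shard_by_first_letter(name, num_shards=NUM_SHARDS):
    """The poor scheme: pick a shard from the first letter of the name."""
    return (ord(name[0].lower()) - ord("a")) % num_shards


def shard_by_hash(key, num_shards=NUM_SHARDS):
    """Pick a shard from a stable hash of the whole key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Names are not uniformly distributed over the alphabet, so the
# first-letter scheme sends every "s" name here to the same shard.
names = ["sam", "sara", "scott", "steve", "sue", "tim", "ann"]
print(Counter(shard_by_first_letter(n) for n in names))
print(Counter(shard_by_hash(n) for n in names))
```

With the first-letter scheme the distribution depends on how names happen to be spelled; hashing the full key removes that dependence, although a plain modulo still reshuffles most keys when the shard count changes.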

Read the rest of this entry »

Written by Suhail

May 11th, 2011 at 4:29 pm

Posted in Backend

We’re moving. Goodbye Rackspace.

with 76 comments

At Mixpanel, where our hardware lives and the platform we use to help us scale have become increasingly important. Unfortunately (or fortunately) our data processing doesn’t always scale linearly. When we get a brand new customer, sometimes we have to scale by a step function; this has been a problem in the past, but we’ve gotten better at it.

So what’s the short of it? We’re unhappy with the Rackspace Cloud and love what we’re seeing at Amazon.

Over our history we’ve used quite a few “cloud” offerings. First was Slicehost, back when everything was on a single 256MB instance (yeah, that didn’t scale). Second was Linode, because it was cheaper (money mattered to me at that point). Lastly, we moved over to the Rackspace Cloud because they cut a deal with Y Combinator (one of the many benefits of being part of YC). Even with all the lock-in we have with Rackspace (we have 50+ boxes and are hiring if you want to help us move them!), it’s really not about the money but about the features and the product offering. Here’s why we’re moving:

Read the rest of this entry »

Written by Suhail

November 8th, 2010 at 4:03 pm

Posted in Uncategorized

gevent: the Good, the Bad, the Ugly

with 14 comments

I’m not going to spend much time describing what gevent is. I think the one sentence overview from its web site does a better job than I could:

gevent is a coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of libevent event loop.

What follows are my experiences using gevent for an internal project here at Mixpanel. I even whipped up some performance numbers specifically for this post!

Read the rest of this entry »

Written by Avery Fay

October 29th, 2010 at 11:07 am

Posted in Backend,python

Building C extensions in Python

with 9 comments

At Mixpanel, performance is particularly important to us, and as we begin to scale our data volume to support billions of actions, we’ve found ourselves thinking about how to solve problems better.

We’re currently writing a feature that is going to require considerable scale and performance, and we had to think about how to make it fast enough to keep our users happy. Unfortunately, Python is too slow for some of the operations we want to perform, where something lower level like C can give us an order of magnitude better performance.

So imagine: You want to stick to Python because it’s so fast to develop in but need the performance of C/C++. Let me introduce you to C extensions in Python.

If you’ve ever used something like cJSON, then you’ve already installed an extension like this. It’s likely that a lot of the modules you import in Python are built in C rather than written in pure Python.

Read the rest of this entry »

Written by Suhail

September 30th, 2010 at 7:15 pm

Posted in Uncategorized

Best Javascript Charting Libraries

with 5 comments

When we started Mixpanel, we used amCharts, a pretty full-featured Flash-based charting library. This wasn’t ideal though – it’s closed-source and, well, it’s Flash. We ultimately switched over to pure JavaScript charts and it was a great decision.

Now if something wonky happens, I can easily modify the library code. We also get the added benefit of broader platform support – you can use mixpanel.com on your mobile device and it works perfectly.

Actually picking the library was a little tricky. We were lucky – Highcharts was released right when we started looking and it has performed admirably. There are a few other good choices though, and I will go into all of them in some depth.

Read the rest of this entry »

Written by Tim Trefren

September 17th, 2010 at 10:29 am

Posted in Frontend


Automating your firewall with Django and Fabric

with one comment

In my previous post covering OpenVPN, I said that we needed to restrict access to most of our servers – they will only be accessible to each other, rather than open to the outside world.

How do we do this? iptables. You can add iptables rules that explicitly state the IP addresses that are allowed through the firewall, and then disallow everything else.

If our network were static – meaning we would never have to add more machines – then this would be really simple. All you’d need to do is update your iptables file once with the IP of every server you own, and you’re done. No worries.

In the real world, the network isn’t static. We’re adding new machines all the time, and if we don’t update iptables at the same time, the new machines won’t be able to communicate with the old ones. To solve this problem, I dynamically generate iptables files and deploy them with Fabric.

Note: all code mentioned in this post can be found on GitHub here: http://github.com/ttrefren/firewall
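For readers who want a feel for the idea before following the link, here is a minimal sketch of generating an allow-list ruleset from a list of server IPs. It is not taken from that repository and uses neither Django nor Fabric: the function name and the example addresses are hypothetical, and the output is text in the format accepted by `iptables-restore`.

```python
import ipaddress


def render_iptables(allowed_ips):
    """Return iptables-restore text that accepts traffic only from allowed_ips."""
    # Validate, de-duplicate, and sort; IPv4Address raises ValueError on bad input.
    ips = sorted({ipaddress.IPv4Address(ip) for ip in allowed_ips})

    lines = [
        "*filter",
        ":INPUT DROP [0:0]",      # default policy: drop anything not allowed below
        ":FORWARD DROP [0:0]",
        ":OUTPUT ACCEPT [0:0]",
        "-A INPUT -i lo -j ACCEPT",  # always allow loopback
        "-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT",
    ]
    # One ACCEPT rule per known server.
    lines += ["-A INPUT -s {} -j ACCEPT".format(ip) for ip in ips]
    lines.append("COMMIT")
    return "\n".join(lines) + "\n"


# Hypothetical server list; in practice this would come from an inventory.
print(render_iptables(["10.0.0.2", "10.0.0.1", "10.0.0.2"]))
```

Because the whole file is regenerated from the current server list, adding a machine means re-rendering and redeploying the same file everywhere rather than editing rules by hand.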

Read the rest of this entry »

Written by Tim Trefren

September 14th, 2010 at 1:48 pm

Posted in Operations
