Mixpanel Engineering

Real-time scaling

Archive for the ‘Backend’ Category

Sharding techniques

with 5 comments

At Mixpanel, we process billions of API transactions each month, and that number can increase rapidly over the course of a single day. It’s not uncommon for us to see spikes of 100 req/s when new customers decide to integrate. Finding ways to distribute data intelligently is pivotal to our ability to remain real-time.

I am going to discuss several techniques for distributing data horizontally. We have interviewed candidates (by the way, we’re hiring engineers) who made poor partitioning decisions (e.g. partitioning by the first letter of a user’s name), so I think we can spread some knowledge around. Hopefully, you’ll learn something new.
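
To see why partitioning by first letter tends to go wrong, here is a minimal sketch in Python. It is not Mixpanel's implementation: the shard count, the sample names, and the helper functions are all made up for illustration. It compares a first-letter scheme with hashing the whole key using a stable hash.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical shard count, chosen only for illustration


def shard_by_first_letter(name: str, num_shards: int = NUM_SHARDS) -> int:
    """Naive scheme: map the first letter's alphabet position onto a shard.

    Assumes a non-empty name; skew follows however initials are distributed.
    """
    return (ord(name[0].lower()) - ord("a")) % num_shards


def shard_by_hash(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Hash the whole key with a stable hash, then reduce modulo the shard count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribution(keys, shard_fn, num_shards: int = NUM_SHARDS):
    """Return how many keys land on each shard, as a list indexed by shard."""
    counts = Counter(shard_fn(k, num_shards) for k in keys)
    return [counts.get(i, 0) for i in range(num_shards)]


# Made-up sample: names cluster on a few initials, as real names often do.
names = ["james", "john", "jane", "jack", "julia", "joan", "sam", "sara"]

by_letter = distribution(names, shard_by_first_letter)
by_hash = distribution(names, shard_by_hash)

print("first letter:", by_letter)  # every "j" name lands on the same shard
print("hash of key: ", by_hash)
```

With this sample, the first-letter scheme puts six of the eight names on one shard and leaves two shards empty. With so few keys the hashed version will not be perfectly even either, but its balance no longer depends on how initials happen to be distributed.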

Written by Suhail

May 11th, 2011 at 4:29 pm

Posted in Backend

gevent: the Good, the Bad, the Ugly

with 14 comments

I’m not going to spend much time describing what gevent is. I think the one-sentence overview from its website does a better job than I could:

gevent is a coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of libevent event loop.

What follows are my experiences using gevent for an internal project here at Mixpanel. I even whipped up some performance numbers specifically for this post!

Written by Avery Fay

October 29th, 2010 at 11:07 am

Posted in Backend, python