Archive for the ‘Backend’ Category
Sharding techniques
At Mixpanel, we process billions of API transactions each month and that number can sometimes increase rapidly just in the course of a day. It’s not uncommon for us to see 100 req/s spikes when new customers decide to integrate. Thinking of ways to distribute data intelligently is pivotal in our ability to remain real-time.
I am going to discuss several techniques that allow people to horizontally distribute data. We have conducted interviews (by the way, we’re hiring engineers) with people in the past that make poor decisions in partitioning (e.g. partitioning by the first letter in a user’s name) and I think we can spread some knowledge around. Hopefully, you’ll learn something new.
gevent: the Good, the Bad, the Ugly
I’m not going to spend much time describing what gevent is. I think the one sentence overview from its web site does a better job than I could:
gevent is a coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of libevent event loop.
What follows are my experiences using gevent for an internal project here at Mixpanel. I even whipped up some performance numbers specifically for this post!