Debugging MySQL performance at scale

On Monday we shipped distinct_id aliasing, a service that makes it possible for our customers to link multiple unique identifiers to the same person. It’s running smoothly now, but we ran into some interesting performance problems during development. I’ve been fairly liberal with my keywords; hopefully this will show up in Google if you encounter the same problem.

The operation we’re doing is conceptually simple: for each event we receive, we make a single MySQL SELECT query to see if the distinct_id is an alias for another ID. If it is, we replace it. This means we get the benefits of multiple IDs without having to change our sharding scheme or move data between machines.
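
In code, the hot path is roughly the sketch below. To be clear, this is an illustration, not our actual service: the aliases table, its columns, and the MySQLdb driver are all stand-ins.

import MySQLdb  # any DB-API driver works; MySQLdb is just a stand-in

def resolve_distinct_id(conn, project_id, distinct_id):
    # Return the canonical ID if distinct_id is an alias, otherwise return it unchanged.
    # Hypothetical schema: aliases(project_id, alias, canonical_id)
    cursor = conn.cursor()
    cursor.execute(
        "SELECT canonical_id FROM aliases WHERE project_id = %s AND alias = %s",
        (project_id, distinct_id),
    )
    row = cursor.fetchone()
    cursor.close()
    return row[0] if row else distinct_id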

A single SELECT would not normally be a big deal – but we’re doing a lot more of them than most people. Combined, our customers have many millions of end users, and they send Mixpanel events whenever those users do stuff. We did a little back-of-the-envelope math and determined that we would have to handle at least 50,000 queries per second right out of the gate.
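
To put that number in perspective (the figures here are just illustrative arithmetic, not our real traffic): 50,000 queries per second works out to over 4 billion SELECTs per day, and that’s an average rate, with peaks running well above it.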

We started with two beefy machines (128GB RAM, 32 cores) configured as master/master. Here’s the configuration file we had before going into production:

——

——
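
(The file itself isn’t reproduced above, but for a rough sense of shape, a my.cnf for a box this size looks something like the sketch below. Every value here is an illustrative assumption, not our actual configuration.)

[mysqld]
# illustrative values only, sized for a 128GB, 32-core machine
innodb_buffer_pool_size        = 100G
innodb_buffer_pool_instances   = 8
innodb_log_file_size           = 1G
innodb_flush_method            = O_DIRECT
max_connections                = 2000
# note: no innodb_thread_concurrency line, i.e. the MySQL 5.5 default of 0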

In initial testing, everything seemed fine. So, we pushed the new service into production – it would put real load on the DBs, but wouldn’t actually do anything else yet. Things looked okay for the first few minutes.

Then, processing slowed to a crawl and a queue started building up. A quick glance at the db machine showed 100% utilization of all 32 cores. We pulled the service out of production and started debugging.

First we had to replicate the issue. It only showed up under heavy load, so I wrote a few scripts: one to configure a load test machine, one to start some load processes, and one to stop them. Then I spun up 24 cloud instances and was easily able to start & stop the load test.
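
The load processes themselves don’t have to be fancy. A minimal version, reusing the hypothetical aliases schema from the sketch above (hostnames and credentials are placeholders, and the real scripts aren’t shown here), is just a handful of worker processes running the SELECT in a tight loop:

import random
from multiprocessing import Process

import MySQLdb  # stand-in driver

def hammer(host, n_queries=1000000):
    # open one connection per worker and issue SELECTs as fast as possible
    conn = MySQLdb.connect(host=host, user="loadtest", passwd="secret", db="aliasdb")
    cursor = conn.cursor()
    for _ in xrange(n_queries):
        # randomized IDs so we aren't hammering a single row
        cursor.execute(
            "SELECT canonical_id FROM aliases WHERE project_id = %s AND alias = %s",
            (random.randint(1, 10000), str(random.getrandbits(64))),
        )
        cursor.fetchone()

if __name__ == "__main__":
    workers = [Process(target=hammer, args=("db-master.example.com",)) for _ in range(32)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()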

After experimenting a bit, we were able to consistently reproduce the issue on the live db. Things would be going fine, and then fall off a cliff – suddenly all cores would hit 100% utilization, and QPS would drop dramatically.


Queries per second
-------------------
Wed Oct 24 22:27:19 UTC 2012: 18660
Wed Oct 24 22:27:20 UTC 2012: 70713
Wed Oct 24 22:27:21 UTC 2012: 95458
Wed Oct 24 22:27:22 UTC 2012: 102481
Wed Oct 24 22:27:23 UTC 2012: 100699
Wed Oct 24 22:27:24 UTC 2012: 94557
Wed Oct 24 22:27:25 UTC 2012: 81161
Wed Oct 24 22:27:26 UTC 2012: 67551
Wed Oct 24 22:27:27 UTC 2012: 56643
Wed Oct 24 22:27:28 UTC 2012: 45078
Wed Oct 24 22:27:29 UTC 2012: 37634
Wed Oct 24 22:27:30 UTC 2012: 27729
Wed Oct 24 22:27:31 UTC 2012: 23331
Wed Oct 24 22:27:32 UTC 2012: 17784
Wed Oct 24 22:27:33 UTC 2012: 14870
Wed Oct 24 22:27:34 UTC 2012: 15449
Wed Oct 24 22:27:35 UTC 2012: 12695
Wed Oct 24 22:27:36 UTC 2012: 8152
Wed Oct 24 22:27:37 UTC 2012: 10105
Wed Oct 24 22:27:38 UTC 2012: 7825
Wed Oct 24 22:27:39 UTC 2012: 5479
Wed Oct 24 22:27:40 UTC 2012: 7515
Wed Oct 24 22:27:41 UTC 2012: 4186
Wed Oct 24 22:27:42 UTC 2012: 4182
Wed Oct 24 22:27:43 UTC 2012: 3704
Wed Oct 24 22:27:44 UTC 2012: 3876
Wed Oct 24 22:27:45 UTC 2012: 2535
Wed Oct 24 22:27:46 UTC 2012: 2733
Wed Oct 24 22:27:47 UTC 2012: 3044
Wed Oct 24 22:27:48 UTC 2012: 2347
Wed Oct 24 22:27:49 UTC 2012: 2094
Wed Oct 24 22:27:50 UTC 2012: 2460
Wed Oct 24 22:27:51 UTC 2012: 1641
Wed Oct 24 22:27:52 UTC 2012: 1901

This would only happen at a certain level of load – 21 machines going full-bore wouldn’t trigger it, but 22 would.
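
For the record, per-second numbers like the ones above are easy to collect with mysqladmin’s relative mode; this is one way to do it, not necessarily the way our measurement script did:

# -r prints the change since the previous sample, -i 1 samples every second
mysqladmin -u root -p -r -i 1 extended-status | grep -w Queries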

At this point, we still had no idea what was going on, but this was a pretty serious performance regression – so bad, in fact, that we thought it might be a bug in MySQL. To get a better picture of what was happening, we attached strace to a single busy mysqld thread.
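
Attaching is a one-liner once you’ve picked a busy thread ID out of ps; the invocation below is representative rather than copied from our terminal:

# find a busy mysqld thread, then attach strace to it
ps -eLo pid,lwp,pcpu,comm | grep mysqld | sort -rn -k3 | head
strace -p <thread-id>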

strace output:
--------------
sched_yield() = 0
futex(0x7f901e08c5f0, FUTEX_WAIT_PRIVATE, 2, NULL) = 0
futex(0x7f901e08c5f0, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7f901e08c5f0, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7f901e0c9c9c, FUTEX_WAIT_PRIVATE, 5955446, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f901e0c9c9c, FUTEX_WAIT_PRIVATE, 5955449, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f901e0c9c9c, FUTEX_WAIT_PRIVATE, 5955456, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f901e0c9c9c, FUTEX_WAIT_PRIVATE, 5955458, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f901e0c9c9c, FUTEX_WAIT_PRIVATE, 5955459, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f901e0c9c9c, FUTEX_WAIT_PRIVATE, 5955463, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f901e0c9c9c, FUTEX_WAIT_PRIVATE, 5955468, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f901e0c9c9c, FUTEX_WAIT_PRIVATE, 5955470, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f901e0c9c9c, FUTEX_WAIT_PRIVATE, 5955471, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f901e0c9c9c, FUTEX_WAIT_PRIVATE, 5955475, NULL) = 0
futex(0x7f901e0c9c60, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield() = 0
futex(0x7f901e08c5f0, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7f901e0c9c9c, FUTEX_WAIT_PRIVATE, 5957951, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f901e0c9c9c, FUTEX_WAIT_PRIVATE, 5957953, NULL) = 0
futex(0x7f901e0c9c60, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f901e08c5f0, FUTEX_WAIT_PRIVATE, 2, NULL) = 0
futex(0x7f901e08c5f0, FUTEX_WAKE_PRIVATE, 1) = 1
sched_yield() = 0
...

Apparently, lots of lock contention: the thread spends its time bouncing between futex waits and sched_yield() calls instead of doing useful work. Here’s strace output during normal execution:


read(3429, 0x7f667f0e75c0, 4) = -1 EAGAIN (Resource temporarily unavailable)
fcntl(3429, F_SETFL, O_RDWR) = 0
read(3429, "W", 4) = 4
read(3429, "\3SELECT redacted FROM redacted"..., 87) = 87
fcntl(3429, F_SETFL, O_RDWR|O_NONBLOCK) = 0
setsockopt(3429, SOL_SOCKET, SO_RCVTIMEO, "\36", 16) = 0
select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout)
futex(0x7f6624fff8bc, FUTEX_WAIT_PRIVATE, 28981, NULL) = 0
futex(0x7f6624fff880, FUTEX_WAKE_PRIVATE, 1) = 0
sched_yield() = 0
futex(0x7f662439ac9c, FUTEX_WAIT_PRIVATE, 137651346, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f662439ac60, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7f662439ac9c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x7f662439ac60, 137651386) = 3
futex(0x7f662439ac60, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7f6622f2d680, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f6622f2d680, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f6622f2d680, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f6624fdcb3c, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x7f6624fdcb00, 32256) = 1
futex(0x7f6624fdcb00, FUTEX_WAKE_PRIVATE, 1) = 1
write(3429, "\1\1\1C\2\3def\7redacted\vredacted"..., 94) = 94
setsockopt(3429, SOL_SOCKET, SO_RCVTIMEO, "\200p", 16) = 0
read(3429, 0x7f667f0e75c0, 4) = -1 EAGAIN (Resource temporarily unavailable)
fcntl(3429, F_SETFL, O_RDWR) = 0
read(3429, "\7", 4) = 4
...

That looks a little more normal – the query comes in on a read(), there’s some locking in the middle, and the result goes back out on a write().

So, we started googling and reading docs. This seemed like something other people had to have hit before – we weren’t doing anything *that* outlandish.

It turns out there are known problems with mutex contention in MySQL 5.5, with a couple of proposed solutions: tweak the innodb_sync_spin_loops and innodb_spin_wait_delay variables (which control how long InnoDB spin-waits on a contended mutex before suspending the thread), or manually specify a value for innodb_thread_concurrency.
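
For reference, the spin-tuning route is also just a couple of my.cnf lines. We didn’t end up going this way, and the values below are illustrative, not recommendations:

# spin-wait rounds on a contended mutex before the thread is suspended (default 30)
innodb_sync_spin_loops = 100
# maximum random delay between spin polls (default 6)
innodb_spin_wait_delay = 24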

In the end, we opted to manually specify the concurrency level, because it felt less like voodoo.

The fix came down to a single line in our my.cnf. This is what we were effectively running before:
innodb_thread_concurrency = 0

Zero means “disable thread concurrency checking.” It’s the default in MySQL 5.5, and it places no limit on how many threads can run inside InnoDB at once.

Switching this to basically any nonzero value (we tried 2, 4, 16, 32, and 64) completely resolved the issue and gave a consistent 90K+ queries per second in testing.

We settled on
innodb_thread_concurrency = 32

and got the service back in production, where it’s been handling ~50K qps for weeks now.
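
A nice property of this variable is that it’s dynamic, so you can experiment on a live server without a restart and then persist the winning value in my.cnf:

-- takes effect for new queries immediately; mirror the value in my.cnf so it survives restarts
SET GLOBAL innodb_thread_concurrency = 32;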

TL;DR: Don’t trust InnoDB to manage its own threads.

If you're interested in what we work on, please apply - we're hiring: http://mixpanel.com/jobs/
