Benchmarking thread scheduling in group commit, part 2
I got access to our 12-core Intel server, so I was able to do some better
benchmarks to test the different group commit thread scheduling methods.
This graph shows queries-per-second as a function of the number of parallel
connections, for three test runs:

- Baseline MariaDB, without group commit.
- MariaDB with group commit, using the simple thread scheduling, where the
serial part of the group commit algorithm is done by each thread in turn,
signalling the next one when done.
- MariaDB with group commit and the optimised thread scheduling, where the
first thread does the serial group commit processing for all queued
transactions at once, in a single thread.

(See the previous post linked above for a more detailed explanation of the two
thread scheduling algorithms; a sketch of both follows below.)
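To make the difference between the two algorithms concrete, here is a minimal
pthreads sketch of both. This is not the actual MariaDB code: struct trx,
serial_part() and the queueing details are simplified stand-ins I made up for
illustration.

    #include <pthread.h>
    #include <stdbool.h>
    #include <stddef.h>

    struct trx { struct trx *next; bool done; };  /* hypothetical queued transaction */

    void serial_part(struct trx *t);  /* stand-in for the serial work: binlog write etc. */

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;

    /* Simple scheduling: each thread runs its own serial part when its turn
       comes, then signals the next thread in line.  Every commit thus pays
       for a wakeup and context switch on the serial critical path. */
    static unsigned long next_ticket, now_serving;

    void commit_simple(struct trx *me)
    {
        pthread_mutex_lock(&lock);
        unsigned long ticket = next_ticket++;  /* take a place in line */
        while (ticket != now_serving)
            pthread_cond_wait(&cond, &lock);   /* sleep until it is my turn */
        pthread_mutex_unlock(&lock);

        serial_part(me);                       /* my own serial work only */

        pthread_mutex_lock(&lock);
        now_serving++;                         /* hand over to the next thread */
        pthread_cond_broadcast(&cond);
        pthread_mutex_unlock(&lock);
    }

    /* Optimised scheduling: the first thread to queue up becomes the leader
       and runs the serial part for every queued transaction; the followers
       sleep until the whole batch is done.  Callers set me->done = false. */
    static struct trx *queue;  /* transactions waiting for the serial phase */
    static bool busy;          /* a batch is currently in the serial phase */

    void commit_optimised(struct trx *me)
    {
        pthread_mutex_lock(&lock);
        bool leader = (queue == NULL);
        me->next = queue;
        queue = me;
        if (!leader) {
            while (!me->done)                  /* follower: just wait */
                pthread_cond_wait(&cond, &lock);
            pthread_mutex_unlock(&lock);
            return;
        }
        while (busy)                           /* wait for the previous batch; */
            pthread_cond_wait(&cond, &lock);   /* meanwhile the queue fills up  */
        struct trx *batch = queue;             /* grab the whole queue */
        queue = NULL;
        busy = true;
        pthread_mutex_unlock(&lock);

        for (struct trx *t = batch; t; t = t->next)
            serial_part(t);                    /* all the serial work, one thread */

        pthread_mutex_lock(&lock);
        busy = false;
        for (struct trx *t = batch; t; t = t->next)
            t->done = true;
        pthread_cond_broadcast(&cond);         /* wake followers and next leader */
        pthread_mutex_unlock(&lock);
    }

The point of the second variant is that the per-transaction wakeup disappears
from the serial path: one thread stays on its CPU and works through the whole
batch, instead of every thread being woken up just to do its own small piece of
serial work and signal the next.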
This test was run on a 12-core server with hyper-threading and 24 GByte of
memory. MariaDB was running with its datadir in /dev/shm (Linux ram disk), to
simulate a really fast disk system and maximise the stress on the CPUs. The
binlog is enabled, innodb_flush_log_at_trx_commit=1 is set, and the table type
is InnoDB.
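In terms of configuration, that setup corresponds to something like the
following my.cnf fragment (the datadir path and binlog name are made-up
placeholders; log-bin and innodb_flush_log_at_trx_commit=1 are the settings
described above):

    [mysqld]
    datadir                        = /dev/shm/mysql-data  # ram disk standing in for a very fast disk
    log-bin                        = master-bin           # binlog enabled
    innodb_flush_log_at_trx_commit = 1                    # flush the InnoDB log at every commit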
I use Gypsy to generate the client load, which consists of simple auto-commit
primary key updates:
REPLACE INTO t (a,b) VALUES (?, ?)
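The exact table definition is not shown here; a minimal schema that the
statement could run against would be something like this (my guess, assuming a
is the primary key):

    CREATE TABLE t (
      a INT NOT NULL PRIMARY KEY,
      b INT NOT NULL
    ) ENGINE=InnoDB;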
The graph clearly shows that the optimised thread scheduling algorithm improves
scalability. As expected, the effect is more pronounced on the 12-core server
than on the 4-core machine I tested on previously: the optimised thread
scheduling has around 50% higher throughput at higher concurrencies, while the
naive thread scheduling algorithm suffers from scalability problems to the
degree that it is only slightly better than no group commit at all (but
remember that this is on a ram disk, where group commit is hardly needed in the
first place).
There is no doubt that this kind of optimised thread scheduling involves some
complications and trickery. Running one part of a transaction in a different
thread context from the rest does have the potential to cause subtle bugs.
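One made-up example of the sort of thing that can go wrong: errno (and, in the
server, similar per-thread state) belongs to the thread that executes the code,
not to the transaction. If the leader hits an error while committing on behalf
of a follower, the error lands in the leader's thread-local state and has to be
copied back explicitly. A fresh minimal sketch, with write_binlog() and the
struct field as stand-ins:

    #include <errno.h>

    struct trx { int saved_errno; };  /* hypothetical, for illustration */

    int write_binlog(struct trx *t);  /* stand-in for the real binlog write */

    void leader_commit_one(struct trx *t)
    {
        if (write_binlog(t) < 0)
            t->saved_errno = errno;  /* errno here is the LEADER's errno; without
                                        this copy, the thread owning t would never
                                        see the failure */
    }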
On the other hand, we are moving fast towards more and more CPU cores and more
and more I/O resources, and scalability just keeps getting more and more
important. If we can scale MariaDB/MySQL with the hardware improvements, more
and more applications can make do with scale-up rather than scale-out, which
significantly simplifies the system architecture.
So I am just not comfortable introducing more serialisation (e.g. more global
mutex contention) in the server than absolutely necessary. That is why I did
the optimisation in the first place, even without testing it. Still, the
question is whether an optimisation that only has an effect above 20,000
commits per second is worth the extra complexity. I still need to think this
over to finally make up my mind, and discuss it with the other MariaDB
developers, but at least now we have a good basis for such a discussion (and
fortunately, the code is easy to change one way or the other).
Tags: mariadb, mysql, performance, programming