Using MASTER_GTID_WAIT() to avoid stale reads from slaves in replication|
I have just implemented MASTER_GTID_WAIT() in MariaDB
10.0. This can be used to give a very elegant solution to the problem of
in replication read-scaleout, without incuring the overheads normally associated with synchronous
replication techniques. This idea came up
recently in a discussion with Stephane Varoqui, and is similar to the concept
of Lamport logical clock described in this Wikipedia
I wanted to describe this, hoping to induce people to test and maybe start
using this, as it is a simple but very neat idea, actually.
A very typical use of MariaDB/MySQL replication is for read-scaleout.
The application does all updates against a single master, which replicates to
a set of slaves. The application can then distribute its reads among the
slaves. If most of the database load is from reads, then this is an effective
way to scale beyond what a single database server can handle.
The problem of stale reads occurs since MariaDB/MySQL replication is
asynchronous. There is some delay from the commit of a change on the
master server until that change becomes committed and visible on a slave
server. This problem can be quite annoying in many cases, eg. user changes
some preferences and reloads the page (read goes to a different slave server
which is not caught up); now the change looks as if it was lost.
The idea is to use the MariaDB Global Transaction
ID (GTID for short) as a logical
clock. When we do an update on the master, we can obtain the associated
@@LAST_GTID. Then on the slave, we can execute a
MASTER_GTID_WAIT(<GTID from master>, <timeout>) before doing a
query for which read consistency is critical; this will ensure that the slave
will catch up before the query is run, and that no stale read results. Or if
we get a timeout, we can try another slave server, which hopefully is not
lagging as much behind.
Typically, we would pass around the GTID (the logical clock value) between
different parts of the system to maintain consistency where it is needed and
avoid stale reads. Eg. in a web application, we would set a cookie with the
GTID when an update is made to the database. Then on the next HTTP request, we
have the GTID which is needed in
MASTER_GTID_WAIT(), even if it
goes to a completely different web server. Or if we send an email, we can
encode the GTID in any contained link, to be sure that it will work whichever
slave server it might end up in, possibly in a different data center on a
By passing around the GTID whenever we communicate between parts of the
system, we can ensure that if transaction A is known to have occured before
transaction B, then any change made by A will also be visible to B. If there
was no communication (direct or indirect), then B cannot know that A came
before - and if there was communication, we can avoid stale reads by passing
the GTID as part of that communication.
The great thing about this technique is that it is optional. We can
use it for just the queries where avoiding stale reads is critical, and only
there take the associated penalty (in the form of increased latency). Other parts of the
database or application will not be negatively affected. This is much more
flexible than using some form of synchronous replication, which will incur
some penalty for all parts of the system.
So for applications that require more performance than what is currently
possible to get from fully synchronous replication, we can choose consistency
in a few critical places where it matters, and go for the performance of
asynchronous replication in the rest.
I am hoping that some users will start to do tests with this, and I am very
eager to hear any feedback and suggestions for further improvements (the maria-developers@ list is a
good place for those). I think this can become a very useful technique in many
By the way, note that it is not necessary to actually switch the replication
to GTID to use this technique, as in MariaDB the GTIDs are always generated
and replicated, even if they are not used for the slave when connecting to the
master. This should make it simple for people to start testing. One will need
to run MariaDB 10.0 on both the master and the slave, of course - but this
could be achieved by setting up one MariaDB server as a slave of the main
system (which could be running MySQL or Percona Server or MariaDB 5.5 or
whatever), and then setting up another MariaDB 10.0 slave using the first slave
If this turns out to be something that people find generally useful, it would
be interesting to integrate this more tightly into the client protocol and
client API. The
@@LAST_GTID could be sent automatically back with
the query results, the same way that
done. And another option could enable the automatic execution of the
MASTER_GTID_WAIT() call before queries sent to the slave
servers. This could make the technique almost transparent for the end
application, and could help avoid the overhead of extra client-server
roundtrips and extra SQL parsing.
However, that is only worth doing if this turns out to be something that will
actually be shown to be useful in practice. So if you want to see that happen
eventually, then get testing and send in that feedback!
The code is available in
the 10.0 bzr
tree now, and will be in the next (10.0.9) release of MariaDB.
Tags: database, mariadb, mysql, performance, programming, replication