
Re: tungsten-replicator (was: Re: Persona data storage improvement discussion and next steps)


Sheeri Cabral

May 28, 2013, 11:14:09 AM
to 6a...@mozilla.com, Mozilla Services Operations, bban...@mozilla.com, Gene Wood, dev-id...@lists.mozilla.org
Hi Jared,

I have had time to take a look. Tungsten Replicator will work for our needs, but we still need to worry about things like autoincrements and duplicate keys. Tungsten makes managing replication among MySQL clusters in multiple data centers with multiple masters much easier, and has auto-failovers and neat stuff like that, but the underlying need to avoid duplicate data still needs to be solved (like using auto_increment_increment and auto_increment_offset).
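To illustrate the auto_increment_increment / auto_increment_offset approach, here is a minimal Python sketch of the id sequences each master would generate. It assumes empty tables and a hypothetical two-master setup (the names master_a and master_b are made up for illustration); it models the documented MySQL behavior of handing out values of the form offset + N * increment, and is not a Tungsten or MySQL configuration.

```python
def auto_increment_ids(offset, increment, count):
    """Return the first `count` ids a server would assign on an empty table,
    given its auto_increment_offset and the shared auto_increment_increment."""
    return [offset + i * increment for i in range(count)]


# Hypothetical setup: each master gets a distinct offset in 1..N,
# and every master uses an increment of N (the number of masters).
masters = {"master_a": 1, "master_b": 2}
increment = len(masters)

ids = {
    name: auto_increment_ids(offset, increment, 5)
    for name, offset in masters.items()
}

for name, values in ids.items():
    print(name, values)
```

Because each master draws from its own residue class modulo the increment, concurrent inserts on different masters cannot produce the same auto-increment value. This only covers auto-increment columns; duplicates on other unique keys still have to be handled separately.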

-Sheeri Cabral
Manager, Systems DB Team
Senior DB Admin/Architect
Mozilla

----- Original Message -----
From: "Jared Hirsch" <jhi...@mozilla.com>
To: "Sheeri Cabral" <sca...@mozilla.com>
Cc: dev-id...@lists.mozilla.org, "Mozilla Services Operations" <servic...@mozilla.com>, bban...@mozilla.com, "Gene Wood" <ge...@mozilla.com>
Sent: Tuesday, May 14, 2013 1:36:09 PM
Subject: tungsten-replicator (was: Re: Persona data storage improvement discussion and next steps)

Hey Sheeri,

Curious if you've had a chance to evaluate Tungsten[1] a bit more as our MySQL multi-region multi-master replication engine. Does it still look like our best option? Are there others that you'd like to investigate before we pick one to prototype?

Thanks,

Jared

[1] http://code.google.com/p/tungsten-replicator/


On Sun, May 12, 2013 at 9:39 AM, Gene Wood < ge...@mozilla.com > wrote:


TL;DR: We discussed higher availability database options in a work
week session, and we identified three solutions to prototype:
A) sticking with a single-master MySQL cluster, focusing on improving
failover and error messaging;
B) a multi-master, multi-region MySQL cluster (possibly using Tungsten);
C) a multi-region Cassandra ring.
*We'd like suggestions from the community on solutions that we didn't
consider which satisfy the requirements.*

This email follows on an initial DB planning message that Jared sent a few
weeks ago[1]; see that message for background and a list of requirements.

We were able to talk through and call out some very specific constraints
and opportunities related to our data storage choice:
* Low read latency is not very important because so much of Persona is
intentionally CPU bound, effectively hiding any other latency behind 500ms
of compute time
* The read/write ratio is very read heavy and very write light
* Any existing instances in Persona of writes followed closely by reads are
not desired/required and will be removed. This effectively removes the need
for immediate consistency
* The data set is small and is not expected to grow beyond the storage on a
single server, effectively removing the need to shard the data.
* We need the data to be highly available, such that within a given
datacenter/region we can stand the loss of a host, and across the world we
can stand the loss of an entire datacenter/region, without human
intervention. The goal is to avoid service downtime, and it drives our
data replication needs. We are ok without immediate consistency, such
that some writes (which are infrequent) could be lost during failover.
* Though we can't tolerate read outages, we can tolerate write outages
of somewhat short duration.
* The data structure/schema is so simple that we don't need any
advanced data search functionality (SELECT WHERE, ORDER BY, etc.). We only
ever look at data for a single user at a time.

Here is more detail on the prototypes listed above that we hope to
implement:
A) Installing ScaleBase[2] or some other tool which will automate the
process of failover. Possibly look into MySQL 5.6[3], which provides more
master promotion options than the existing version.
B) Sheeri Cabral is going to look into Tungsten and let us know how she
sees it fitting with our needs. If it looks applicable we'll bring up a
prototype.
C) With some consultation with Ben Bangert, we're going to bring up a
Cassandra installation.

-Gene

[1]
https://groups.google.com/d/msg/mozilla.dev.identity/kRzXJNfmQmI/lu4qCIFRUs8J
[2] http://www.scalebase.com/
[3]
http://dev.mysql.com/tech-resources/articles/whats-new-in-mysql-5.6.html#replication
_______________________________________________
dev-identity mailing list
dev-id...@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-identity
