Reair - master-master replication


Zheng Shao

Dec 14, 2016, 9:43:36 PM
to reair
Hi,

Master-master replication allows Hive in two data centers to be writable at the same time.

Has anybody thought about what needs to be done to allow Reair to replicate bidirectionally to support master-master replication?

--
Zheng

Zheng Shao

Dec 20, 2016, 9:36:48 PM
to reair
--
Zheng

Paul Yang

Jan 3, 2017, 2:24:49 PM
to Zheng Shao, reair
Sorry for the delay! Many of us were out for the holidays and are just catching up on the backlog.

Since conflict resolution is difficult in incremental replication, the only viable approach for "master-master" that I can think of is to segregate the datasets so that a given table is written from only one cluster. Then, you could set up replication processes using custom replication filters so that the replication processes don't cause conflicts as they run.
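
As a rough illustration of the segregation idea, here is a minimal Python sketch. It is not Reair code (Reair is written in Java) and does not use Reair's filter API. The `TABLE_OWNER` map, the `ReplicationJob` class, and the table names are all hypothetical; the sketch only shows that when each table has exactly one owning cluster, the two replication directions never touch the same table.

```python
from dataclasses import dataclass

# Hypothetical ownership map: each table is written on exactly one cluster.
TABLE_OWNER = {
    "core.users": "A",
    "core.bookings": "A",
    "ml.features": "B",
}


@dataclass(frozen=True)
class ReplicationJob:
    source: str       # cluster the job reads from
    destination: str  # cluster the job writes to

    def accepts(self, table: str) -> bool:
        """Replicate a table only if the source cluster owns it."""
        return TABLE_OWNER.get(table) == self.source


# One job per direction; their accepted table sets are disjoint.
a_to_b = ReplicationJob(source="A", destination="B")
b_to_a = ReplicationJob(source="B", destination="A")

tables = sorted(TABLE_OWNER)
print("A -> B:", [t for t in tables if a_to_b.accepts(t)])
print("B -> A:", [t for t in tables if b_to_a.accepts(t)])
```

Tables missing from the ownership map are replicated by neither job in this sketch, which is a deliberately conservative default.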

Would something like that work for the case you're thinking of?

--
You received this message because you are subscribed to the Google Groups "reair" group.
To unsubscribe from this group and stop receiving emails from it, send an email to airbnb-reair+unsubscribe@googlegroups.com.
To post to this group, send email to airbnb...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/airbnb-reair/CAAguJ7oNNhdtW7kUx8pgqCHRHM6%2BG9f%2Bsbz6x-yvw0CuL-O_BA%40mail.gmail.com.

For more options, visit https://groups.google.com/d/optout.

Zheng Shao

Jan 3, 2017, 4:37:46 PM
to Paul Yang, reair
Our use case looks like this:

1. There are two Hive clusters: A and B
2. A pipeline can run on either cluster A or cluster B, depending on which cluster has free resources.
3. The result should be replicated to the other cluster.

A simple conflict resolution policy like "last write wins" can work well. The assumption (which is true for most cases) is that writes in Hive are mostly done by ETL pipelines, and there is usually a single writer for a single table anyway.

The implementation can be pretty simple: add an option to Reair that does not overwrite "newer" tables/partitions. Then we can run two Reair processes, one in each direction, to make the master-master replication work.
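
A minimal Python sketch of the "do not overwrite newer" rule, for illustration only. It is not the Reair implementation: the `HiveObject` class, its `last_modified` field, and the `replicate` helper are hypothetical, and how "newer" is actually determined is an assumption here (a single modification timestamp per table/partition).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class HiveObject:
    name: str
    last_modified: int  # hypothetical: epoch seconds of the last write


def should_copy(src: HiveObject, dest: Optional[HiveObject]) -> bool:
    """Last write wins: copy only if the destination is missing or older."""
    if dest is None:
        return True
    return src.last_modified > dest.last_modified


def replicate(src: Dict[str, HiveObject], dest: Dict[str, HiveObject]) -> None:
    """One replication pass in a single direction."""
    for name, obj in src.items():
        if should_copy(obj, dest.get(name)):
            dest[name] = obj


# Cluster A has a stale t1 and a new t2; cluster B has a fresher t1.
cluster_a = {"t1": HiveObject("t1", 100), "t2": HiveObject("t2", 300)}
cluster_b = {"t1": HiveObject("t1", 200)}

replicate(cluster_a, cluster_b)  # A -> B: t1 skipped (B is newer), t2 copied
replicate(cluster_b, cluster_a)  # B -> A: t1 copied, t2 skipped (same age)
```

After one pass in each direction both clusters hold the same objects. Comparing timestamps across clusters assumes the clocks are reasonably in sync, and ties are resolved by leaving the destination untouched.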

How does that sound?

--
Zheng

Paul Yang

Jan 3, 2017, 9:20:29 PM
to Zheng Shao, reair
Yeah, I think that will work with the given workload. Were you planning to use batch or incremental replication?

For incremental, one approach might be to make changes to the ObjectConflictHandler class and how it's used so that it can skip the copy task under those conditions. I wanted to make that class pluggable as well, but it hasn't been done yet.
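
To make the "pluggable" idea concrete, here is a small Python sketch. It does not reflect the real ObjectConflictHandler API in Reair (which is Java); the `Action` enum, the policy functions, the `POLICIES` registry, and `plan_copy` are all hypothetical names. It only shows the shape: the copy step asks a configurable policy whether to copy or skip.

```python
from enum import Enum
from typing import Callable, Dict, Optional


class Action(Enum):
    COPY = "copy"
    SKIP = "skip"


# A policy takes (source_mtime, dest_mtime or None) and returns an Action.
ConflictPolicy = Callable[[int, Optional[int]], Action]


def always_overwrite(src_mtime: int, dest_mtime: Optional[int]) -> Action:
    """One-way replication behavior: the source always wins."""
    return Action.COPY


def keep_newer_destination(src_mtime: int, dest_mtime: Optional[int]) -> Action:
    """Skip the copy when the destination is at least as new as the source."""
    if dest_mtime is not None and dest_mtime >= src_mtime:
        return Action.SKIP
    return Action.COPY


# Hypothetical registry; the policy name would come from configuration.
POLICIES: Dict[str, ConflictPolicy] = {
    "overwrite": always_overwrite,
    "keep_newer": keep_newer_destination,
}


def plan_copy(policy_name: str, src_mtime: int, dest_mtime: Optional[int]) -> Action:
    """Look up the configured policy and ask it what to do."""
    return POLICIES[policy_name](src_mtime, dest_mtime)


print(plan_copy("overwrite", 100, 200))
print(plan_copy("keep_newer", 100, 200))
```

With a registry like this, the existing one-way behavior stays the default and the master-master case becomes a configuration choice rather than a code fork.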

Zheng Shao

Jan 3, 2017, 9:40:47 PM
to Paul Yang, reair
We are planning to start with batch first. At some point later this year, we will likely move to incremental to cut down the replication delay.



--
Zheng

Zheng Shao

Jan 5, 2017, 5:48:26 PM
to Paul Yang, Jingwei Lu, reair
Hi Paul/Jingwei,

The patch is ready. Can you help take a look at it?

Zheng

--
Zheng

Zheng Shao

Jan 11, 2017, 7:45:12 PM
to Paul Yang, Jingwei Lu, reair
Hi Paul,


I added a test and fixed the bug in the code. Can you take a look?

Zheng

--
Zheng

Zheng Shao

Jan 12, 2017, 12:57:59 AM
to Paul Yang, Jingwei Lu, reair
Hi Paul,

I just made the warning messages clearer for debugging.

Can you take another look? https://github.com/airbnb/reair/pull/56

Zheng

--
Zheng
