Would UUID as key allow for a safe multi-master setup?


Riyad Kalla

Dec 19, 2011, 1:25:59 PM
to redi...@googlegroups.com
I am trying to (elegantly and efficiently) solve a replication conundrum across all AWS regions globally; Redis, for a multitude of reasons, fits my dataset particularly well.

I understand (from previous questions) that Redis has no conflict-resolution capacity, so an out-of-box master-master (or multi-master) replication configuration is strongly discouraged due to unpredictable behavior when Server 1, 2 and 3 all have conflicting values for user identified by key "johndoe" (for example).

I am curious if I enforce all IDs in my system to be UUIDs (and let's hand-wave away any potential conflicts), in that case, would setting up a master-master-master-* configuration in Redis across all 6 AWS regions actually work out pretty well?

I am fine with eventual consistency, I just want an elegantly simple way to keep my data sets in all regions in sync AND still allow writing to any region (as users are directed to their nearest server via GeoDNS).

Thanks for any feedback.

Best,
Riyad

Josiah Carlson

Dec 19, 2011, 2:01:24 PM
to redi...@googlegroups.com
I think that you should explain what you mean by "master-master" or
"multi-master", because Redis has no such configuration available to
it. If you are talking about taking two Redis instances and making
them slaves of one another: that doesn't work as a master-master
strategy (it actually just induces a slave-sync loop and makes it
impossible to add/alter any data).
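The failure mode can be sketched with two hypothetical local instances (the ports and addresses here are assumptions for illustration, not from this thread):

```
# Instance A (port 6379): replicate from B
redis> SLAVEOF 127.0.0.1 6380
OK

# Instance B (port 6380): replicate from A
redis> SLAVEOF 127.0.0.1 6379
OK

# Each instance now performs full resyncs from the other in a loop,
# discarding its own dataset each time -- no write can survive.
```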

If you are talking about custom scripts to perform master/slave
failover, please explain how you do it, so that we can offer advice to
make it better, or kudos for doing it well.

Regards,
- Josiah


Riyad Kalla

Dec 19, 2011, 2:03:58 PM
to redi...@googlegroups.com
Josiah,

Issuing a "SLAVEOF" on each instance, pointing at every other instance, was what I was planning to do; thank you for the quick clarification on the slave-sync loop, I wasn't aware of that.

Josiah Carlson

Dec 19, 2011, 2:40:26 PM
to redi...@googlegroups.com
Also, a Redis instance can be a slave of at most one other instance. The
moment you issue "SLAVEOF X Y" pointing at a second master, it stops
replicating from the earlier master and re-slaves to the new one.
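A quick illustration of that single-master rule (the addresses are hypothetical, chosen only for the sketch):

```
redis> SLAVEOF 10.0.0.1 6379   # replicate from the first master
OK
redis> SLAVEOF 10.0.0.2 6379   # silently drops 10.0.0.1; now replicating from 10.0.0.2
OK
redis> SLAVEOF NO ONE          # stop replicating; become a master again
OK
```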

But yeah. Don't make loops in your Redis slaving configuration. It is
never the right thing to do.

Regards,
- Josiah

Riyad Kalla

Dec 19, 2011, 10:32:16 PM
to redi...@googlegroups.com
Many thanks Josiah, you saved me a world of hurt here.

Thanh Tran

Dec 20, 2011, 2:16:43 AM
to redi...@googlegroups.com
Hi all!
   I want to store data as <Key, Value> pairs, but on the client side I want to fetch all of the values into a map (to improve speed). How can I do that with Redis?
Thanks so much!

Josiah Carlson

Dec 20, 2011, 4:20:07 AM
to redi...@googlegroups.com
There are two ways.

You can use the plain "SET <key> <value>" or MSET to set your values,
followed by "KEYS" to get all of your keys, then get all of your data
with "GET <key>" or "MGET <key1> <key2> ...".

You can also use HSET/HMSET to create a hash of all of your values,
then use HGETALL to get all of the key/value pairs.
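A sketch of both approaches in redis-cli (the key names here are invented for illustration; note that KEYS scans the entire keyspace, so it is best avoided on large production datasets):

```
# Approach 1: plain keys
redis> MSET user:1 alice user:2 bob
OK
redis> KEYS user:*
1) "user:1"
2) "user:2"
redis> MGET user:1 user:2
1) "alice"
2) "bob"

# Approach 2: a single hash
redis> HMSET users 1 alice 2 bob
OK
redis> HGETALL users
1) "1"
2) "alice"
3) "2"
4) "bob"
```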


That said, I've not seen an application where what you are proposing
makes sense. Maybe you have some new and interesting thing you are
doing, but more likely, you are just wasting memory. If you describe
your actual problem, we are usually pretty good at offering
suggestions for either getting the most out of Redis, or pointing you
off to a solution that may be better suited.

Regards,
- Josiah

Thanh Tran

Dec 20, 2011, 5:11:33 AM
to redi...@googlegroups.com
Hi!
   Thanks for your quick response!
   "You can use the plain "SET <key> <value>" or MSET to set your values, followed by "KEYS" to get all of your keys, then get all of your data with "GET <key>" or "MGET <key1> <key2> ...". --> I don't want to use that approach: if the per-request network latency is k and n is the number of requests, the total latency is k*n. So I am looking for a way to get all <key, value> pairs in a single round trip.
   My application, Feed Ranking, ranks all of a user's feeds and shows only the top 200 in "Feed New". For each user I store an index, lambda, representing that user's "time decay" for the day. At the start of each day, a service loads every user's lambda into a map <userId, lambda>, so the application can read it from memory and run faster.

Josiah Carlson

Dec 20, 2011, 12:01:36 PM
to redi...@googlegroups.com
On Tue, Dec 20, 2011 at 2:11 AM, Thanh Tran
<thanhtd.i...@gmail.com> wrote:
> Hi!
>    Thanks for your early response!
>    "You can use the plain "SET <key> <value>" or MSET to set your values,
> followed by "KEYS" to get all of your keys, then get all of your data  with
> "GET <key>" or "MGET <key1> <key2> ...". --> I don't want to use it because
> :
> If the latency to transport data in network is k and expect n is the number
> of transportations --> The total latency is: k*n  --> So i want to find
> another function to get all <key, value> in one time.

You can use MSET for 1 round trip to set, KEYS for 1 round trip to get
the keys, and MGET for 1 round trip to get all of the values.

Alternatively, you can use HMSET for 1 round trip to set, and HGETALL
for 1 round trip to get.

Even if HMSET/HGETALL/MSET/MGET didn't exist, you could still use
non-transactional pipelines to batch the writes into 1 round trip (the
KEYS call, and the pipelined GETs that depend on its output, would
still need their own round trips).

>    In my application, named Feed Ranking, I want to rank all feeds of an
> user and then get only 200 feeds to appear in "Feed New". I store an index -
> lambda represented for "time decay" of an user for a day --> Begin a new
> day, a service will get all "lambda - index" of all users to a map <userId,
> lambda> and then my application can get it from inside --> speed up my
> application.

You don't need to get all of the lambdas, you only need to get the
"best" 200. If you stored them as part of a zset, which holds
"members" and "scores", you can get the members with the 200 best
scores (depending on whether the best is higher or lower) with a
single command. You can set the values over the course of the day, and
have your app pull an updated list every few minutes, depending on how
quickly you want score changes to propagate.
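The zset approach can be sketched like this (key and member names are invented for illustration; with real data, `ZREVRANGE feedscores 0 199 WITHSCORES` returns up to 200 members, highest score first):

```
redis> ZADD feedscores 3.2 feed:17 8.9 feed:42 1.1 feed:99
(integer) 3
redis> ZREVRANGE feedscores 0 199 WITHSCORES
1) "feed:42"
2) "8.9"
3) "feed:17"
4) "3.2"
5) "feed:99"
6) "1.1"
```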

It is my finding (after having developed a few such scoring methods)
that any time you have scores that "degrade" over time, it's usually a
mistake: if you are looking to generate a total ordering on items, you
have to re-score all of your items... which is a waste. You can
typically flip the score on its head and have scores grow over time
based on some event. Take the claimed Reddit scoring method:
http://amix.dk/blog/post/19588 . Any non-linearities with respect to
the score are related to upvotes/downvotes, not time (the t term
grows linearly and is based on the unix epoch, so you can just use the
timestamp as unixtime). On the other hand, the score for Hacker News
has its non-linearities induced only by time:
http://amix.dk/blog/post/19574 , which requires a bit of math to come
up with an alternative that behaves similarly.

If you can make your score based on some event with a fixed time
component, your score can be calculated once and left alone until
another one of those events occurs. You can also update scores
arbitrarily through the day as events come in, and you don't need to
deal with the "update the world" work that is painful and expensive.
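The Reddit-style score described above can be sketched in a few lines of Python. The constants here (the 45000-second divisor and the December 2005 epoch offset) follow the formula described at the linked amix.dk post; they are assumptions for this sketch, not something stated in this thread:

```python
from math import log10

# Reddit's reference epoch (Dec 8, 2005), per the linked write-up.
EPOCH_OFFSET = 1134028003

def hot(ups, downs, created_unixtime):
    """Score grows logarithmically with net votes and linearly with time,
    so old items never need re-scoring: newer items simply start higher."""
    s = ups - downs
    order = log10(max(abs(s), 1))
    sign = 1 if s > 0 else -1 if s < 0 else 0
    seconds = created_unixtime - EPOCH_OFFSET
    return round(sign * order + seconds / 45000, 7)
```

Note the tradeoff this encodes: gaining 10x the net upvotes is worth one extra "order" point, the same as being posted 45000 seconds (12.5 hours) later.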

Regards,
- Josiah
