Redis database data and version control


Vsevolod Dyomkin

Dec 19, 2012, 4:02:07 AM
to redi...@googlegroups.com
Hi,

I have a use case for Redis as a static dictionary server, which is distributed to many nodes and used there read-only, but is also updated occasionally in a central place.
For distributing updates (and due to some other concerns) I store the RDB database in source control, namely, git, and it kind of works.
But I face 2 problems, which I wanted to discuss here, because they seem quite basic, so someone may have already come up with a solution.

Problem 1: because RDB is binary, you can't diff it. I've made a simplistic tool that helps me keep track of the changes to some extent, but it's tailored to my particular database structure and doesn't give all the information. Maybe there's already a more general solution?

Problem 2: when you update the existing database via git, you need to restart the server to make it re-read the file. Is there a way to make Redis reload the RDB file without restarting?

Thanks for any suggestions,
Vsevolod

Marc Gravell

Dec 19, 2012, 4:23:15 AM
to redi...@googlegroups.com
Would AOF be a better option? This would *seem* (on the surface) to address most of these issues. (http://redis.io/topics/persistence)

Marc


--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To view this discussion on the web visit https://groups.google.com/d/msg/redis-db/-/7vEMudKs5W4J.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.




Salvatore Sanfilippo

Dec 19, 2012, 4:29:14 AM
to Redis DB
Hello,

About diffs: this is very hard, and AOF will not work either, because
every time you generate an AOF, the keys and the individual elements of
aggregate data types may come out in a different order. Otherwise
BGREWRITEAOF could do the trick.

So to diff you probably want to use an RDB -> JSON tool or something
similar, and specifically one that sorts keys lexicographically.
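A minimal sketch of the sorted-dump idea, assuming each database has already been exported to a plain {key: value} structure by some RDB -> JSON tool. The function names canonical_lines and diff_dumps are made up for illustration and are not part of Redis or of any existing tool:

```python
import difflib
import json


def canonical_lines(dump):
    """Render a {key: value} dump as one JSON line per key, keys sorted."""
    lines = []
    for key in sorted(dump):
        value = dump[key]
        # Sets have no inherent order, so sort them to keep the output stable.
        if isinstance(value, (set, frozenset)):
            value = sorted(value)
        # sort_keys makes hash fields come out in a fixed order as well.
        lines.append(f"{json.dumps(key)}: {json.dumps(value, sort_keys=True)}")
    return lines


def diff_dumps(old, new):
    """Return a unified diff (list of lines) between two dumps."""
    return list(difflib.unified_diff(
        canonical_lines(old), canonical_lines(new),
        fromfile="old", tofile="new", lineterm="", n=0))
```

Because both sides are rendered in the same order, only keys whose values really changed show up in the diff, and the one-line-per-key text also diffs reasonably in git.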

To reload a DB you could use:

DEBUG RELOAD

But it's a debugging command, so I cannot guarantee that it will work
well. It is used only inside the Redis test suite, so the approach
should be something like: test it, and if it works for you, use it. But
it is not something supported officially ;)
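A small sketch of issuing the command from a script, assuming a client object that behaves like a redis-py Redis instance (dbsize() and execute_command()). The helper name reload_dataset is hypothetical. One thing to verify before relying on this for a file replaced on disk: to the best of my knowledge DEBUG RELOAD first saves the in-memory dataset to the RDB file and only then loads it back, which would overwrite a file that was just pulled from git, so try it on a throwaway instance first.

```python
def reload_dataset(client):
    """Send DEBUG RELOAD and return the key counts before and after.

    `client` is assumed to offer redis-py style `dbsize()` and
    `execute_command(*args)` methods.
    """
    before = client.dbsize()
    # DEBUG RELOAD is an unsupported debugging command; test it before use.
    client.execute_command("DEBUG", "RELOAD")
    after = client.dbsize()
    return before, after
```

Comparing the two counts is only a rough sanity check that the reload did what was expected.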

Salvatore



--
Salvatore 'antirez' Sanfilippo
open source developer - VMware
http://invece.org


Vsevolod Dyomkin

Dec 19, 2012, 4:33:05 AM
to redi...@googlegroups.com
Salvatore

Thanks a lot for the suggestions! I'll try them and follow up if I find something interesting.

Best,
Vsevolod

Bret A. Barker

Dec 19, 2012, 9:24:26 AM
to redi...@googlegroups.com
The way I've done this before is to store changes to the dataset in redis-cli form as separate .redis files, track those in source control, and apply them during a release. After applying the file(s), send a pubsub message that all the app servers can use to reload the dataset (assuming they were caching locally).

E.g. you could have a file that is 'release-20121219.redis', containing:

# fix color of TPS report covers
set tps/cover/color "ffffcc"
# turn on fancy new feature
set features/fancy "1"

Then run that against your redis master:

grep -v '^#' release-20121219.redis | redis-cli -h mymaster
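The same release step could also be scripted. This is only a sketch: it assumes a client with redis-py style execute_command() and publish() methods, the channel name "dataset-reload" is made up, and shlex quoting is close to but not identical to redis-cli quoting. It skips only lines that start with "#", so a value that contains "#" (say a color written as "#ffffcc") is not dropped.

```python
import shlex


def parse_release(text):
    """Turn the text of a .redis file into a list of command argument lists."""
    commands = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and lines that are entirely a comment.
        if not stripped or stripped.startswith("#"):
            continue
        commands.append(shlex.split(stripped))
    return commands


def apply_release(client, text, channel="dataset-reload"):
    """Run every command in order, then notify subscribers once."""
    commands = parse_release(text)
    for args in commands:
        client.execute_command(*args)
    # App servers subscribed to this (hypothetical) channel can drop their caches.
    client.publish(channel, "reload")
    return len(commands)
```

With redis-py this might be called as apply_release(redis.Redis(host="mymaster"), open("release-20121219.redis").read()).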

-bret

Josiah Carlson

Dec 19, 2012, 11:15:03 AM
to redi...@googlegroups.com
Why not use built-in Redis master/slave support? If data is rarely
updated, then there shouldn't be a lot of traffic. Is it a large
dataset, so initial syncs would be expensive? Is it over slow network
connections?

- Josiah

Sripathi Krishnan

Dec 19, 2012, 6:39:34 AM
to redi...@googlegroups.com
re. RDB and Diffs

redis-rdb-tools has supported diffs for a while: https://github.com/sripathikrishnan/redis-rdb-tools (see the section on comparing RDB files).


--Sri



Javier Guerra Giraldez

Dec 19, 2012, 11:50:10 AM
to redi...@googlegroups.com
On Wed, Dec 19, 2012 at 11:15 AM, Josiah Carlson
<josiah....@gmail.com> wrote:
> Why not use built-in Redis master/slave support?

that would be the cleanest solution, but until there's partial resync
support, any accidental disconnection means reloading all the data
again. right now that's unavoidable on a true master/slave, but in this
case it sounds like unnecessary fragility.

--
Javier

Josiah Carlson

Dec 19, 2012, 12:10:23 PM
to redi...@googlegroups.com
Relying on custom RDB syncing/diff solutions isn't fragile?

I'd argue that Redis master/slave is far more tested, and as such, is
far more reliable than custom solutions.

The lack of incremental resync only matters if dump size is
substantial, network bandwidth is limited, or the network connection
is spotty. Hence my asking questions relating to such in my earlier
reply.

Regards,
- Josiah

Javier Guerra Giraldez

Dec 19, 2012, 12:17:25 PM
to redi...@googlegroups.com
On Wed, Dec 19, 2012 at 12:10 PM, Josiah Carlson
<josiah....@gmail.com> wrote:
> Relying on custom RDB syncing/diff solutions isn't fragile?

if it's infrequent enough, it would be treated like code rollouts,
part of the deployment cycle. i wouldn't call that fragile.
human-dependent, yes, and that's why it's only reasonable for _really_
infrequent updates (i'd say the limit is once a week or less)


> I'd argue that Redis master/slave is far more tested, and as such, is
> far more reliable than custom solutions.

it's definitely attractive. but on week-long connections, spurious
glitches are quite probable. no big deal, of course. and while it's
undesirable for it to happen, it would still be cool to see it
reloading automatically :-)


> The lack of incremental resync only matters if dump size is
> substantial, network bandwidth is limited, or the network connection
> is spotty. Hence my asking questions relating to such in my earlier
> reply.

on that i agree completely

--
Javier

Vsevolod Dyomkin

Dec 19, 2012, 10:42:57 PM
to redi...@googlegroups.com
Thanks, this is just what I need!

Vsevolod Dyomkin

Dec 19, 2012, 10:45:09 PM
to redi...@googlegroups.com
That's an interesting idea to consider, but as pointed out below, it adds the constraint of having a master node that is constantly connected to the slaves. So, overall, I think the approach is feasible, but in my specific use case I prefer to go with rdb-tools + DEBUG RELOAD.

Josiah Carlson

Dec 20, 2012, 12:32:05 AM
to redi...@googlegroups.com
You've got it backwards. Running custom scripts is overhead.

Running an extra copy of Redis as the master and using Redis
configured as slaves is essentially fool-proof and dirt simple. You
can configure Redis slaves to serve old data if the connection to the
master fails for some reason. No need to worry about 'diffs', shipping
rdbs, etc.

You can also alternate between 'slaveof no one' and 'slaveof ip port'
to sync your changes to the slaves. Even this is significantly more
fool-proof than the proposed alternatives.
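A rough sketch of that alternation, assuming a replica client with redis-py style slaveof() and info() methods. The helper name sync_from_master, the timeout, and the polling interval are illustrative choices. It treats master_link_status "up" with no sync in progress as "caught up", which covers the initial sync but is not a guarantee about writes still in flight.

```python
import time


def sync_from_master(replica, master_host, master_port, timeout=60.0, poll=0.5):
    """Attach to the master, wait for the sync to finish, then detach.

    Returns True if the link came up before the timeout, False otherwise.
    """
    replica.slaveof(master_host, master_port)
    deadline = time.monotonic() + timeout
    try:
        while time.monotonic() < deadline:
            info = replica.info("replication")
            link_up = info.get("master_link_status") == "up"
            if link_up and not info.get("master_sync_in_progress"):
                return True
            time.sleep(poll)
        return False
    finally:
        # Called without arguments, redis-py sends SLAVEOF NO ONE,
        # so the node keeps serving the data it has, detached from the master.
        replica.slaveof()
```

Detaching in the finally block means the node goes back to standalone mode even when the sync times out.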

Regards,
- Josiah