backup redis to mysql


Mikouponz Laba

Sep 6, 2012, 3:28:11 AM
to redi...@googlegroups.com
Hello,

I want to backup my redis database to mysql. 

What would be the best architecture?

I am thinking of the following:

1. Create a Slave for the Primary Database on a different server.
2. Write a Node.js application on top of the Slave to read all the data.
3. Create SQL queries and insert into the mysql database.

Is there any better way to do this? And how do I know when the Redis slave instance has been updated with data?
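
Roughly, I imagine step 3 working like this sketch (shown in Python for brevity, though ours would be Node.js; the table and column names are just placeholders):

```python
# Sketch of step 3: turn one record read from the Redis slave into a
# parameterized SQL INSERT. Table and column names are hypothetical.
def to_insert(table, row):
    cols = sorted(row)
    placeholders = ", ".join(["%s"] * len(cols))
    sql = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(cols), placeholders)
    params = tuple(row[c] for c in cols)
    return sql, params

# One record read from the slave, ready to hand to a MySQL client:
sql, params = to_insert("scores", {"user_id": 42, "score": 1300})
```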

Thanks,
Santosh

Josiah Carlson

Sep 6, 2012, 10:00:57 AM
to redi...@googlegroups.com
Oh, you want to sync directly to a database...

You can use the MONITOR command, but that also streams read queries
to your client (and doesn't give you any backlog). Depending on your
operations, this may overwhelm any sort of db sync. Also, I'm not sure
that Node.js could keep up.

You could build your own slave via 'slaveof', but to stay consistent
you would need to implement a parser for the snapshot file, and there
is no incremental sync.

In my opinion, the best option would be something that read and parsed
the AOF from disk. That would bypass the network completely, which
would prevent out-of-memory errors on slow client processing. Though
you would have to resync your entire database on AOF rewrite.

How often are you writing data to Redis? Are you sure your MySQL can
handle the write volume? Why don't you write to MySQL first, then have
a post-commit hook on an ActiveRecord-style interface that induces a
Redis write out-of-band?
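
As a rough sketch of that last pattern -- write to MySQL first, with a post-commit hook scheduling the Redis write out-of-band -- here is a toy version where a list stands in for MySQL and a queue is drained by a hypothetical background Redis worker:

```python
# Toy sketch of write-to-MySQL-first with an out-of-band Redis write.
# The "db" list and the worker-drained queue are stand-ins.
from queue import Queue

redis_queue = Queue()  # a background worker would drain this into Redis

def save_record(db, record):
    db.append(record)        # stands in for the MySQL INSERT + COMMIT
    redis_queue.put(record)  # post-commit hook: schedule the Redis write

db = []
save_record(db, {"user_id": 1, "score": 10})
```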

Regards,
- Josiah
> --
> You received this message because you are subscribed to the Google Groups
> "Redis DB" group.
> To post to this group, send email to redi...@googlegroups.com.
> To unsubscribe from this group, send email to
> redis-db+u...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/redis-db?hl=en.

Felix Gallo

Sep 6, 2012, 10:37:30 AM
to redi...@googlegroups.com
I think you misunderstood him, Josiah.  It seemed to me he wanted to know whether the correct architecture was as follows:

1.  Master (in production, getting beat on by production clients)
2.  Slave
3.  ETL Process (here, written in Node) which connects up to the slave as a normal client, reads changed data, and pushes that changed data into the DB.
4.  DB

and if that's the question he was asking, then the answer is yes: that design works great for me. The general pattern of setting up a slave in order to do 'hard work' (e.g., long-running analytic queries, offloading archival data to a db, saving data to a db for later analytics, etc.) is a common and excellent one, but:

a.  depending on your use case, it may be easier/simpler to merely copy off the AOF files to another server periodically and start up a new Redis instance there, without having it be an explicit slave. Slave semantics mean the slave will always attempt to catch back up to the master, which may not be what you want if you are trying to capture a point in time.

b.  be sure you understand the way that syncing works, because in some cases, and with some data sizes, the master/slave relationship and action sequence is not immediately intuitive.

c.  in common (e.g. cloud) environments, network bandwidth may be limited to a tiny fraction of Redis's throughput.  Carefully consider your network architecture so that your slave doesn't end up either too busy to respond to syncs, or behind such a thin pipe that it can never catch up.
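
A toy sketch of one pass of the ETL process in step 3, assuming the application maintains a "changed keys" set on the slave (that assumption is one possible answer to the change-detection question, not the only one; plain dicts stand in for the Redis slave and the DB):

```python
# One ETL pass: drain the changed-key set, copy each value to the DB.
# The "changed_keys" set and the dict stand-ins are hypothetical.
def etl_pass(slave, db):
    changed = slave.pop("changed_keys", set())
    for key in sorted(changed):
        db[key] = slave[key]  # real code would upsert into MySQL here
    return len(changed)

slave = {"changed_keys": {"user:1"}, "user:1": {"score": 99}}
db = {}
n = etl_pass(slave, db)
```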

Didier Spezia

Sep 6, 2012, 11:24:26 AM
to redi...@googlegroups.com

At least for the initial copy, I would not use a slave, but rather generate
a dump (BGSAVE) on the master, and use Sripathi Krishnan's wonderful
redis-rdb-tools package to extract data from the dump and populate the
MySQL database with a Python script.


Regards,
Didier.

Mikouponz Laba

Sep 6, 2012, 11:33:13 AM
to redi...@googlegroups.com
> How often are you writing data to Redis? Are you sure your MySQL can
> handle the write volume? Why don't you write to MySQL first, then have
> a post-commit hook on an ActiveRecord-style interface that induces a
> Redis write out-of-band?

I will be writing to the MySQL DB every 5 minutes or so. The MySQL DB contains other relational data and will be used for analytics and reports.

Thanks,
Santosh

Mikouponz Laba

Sep 6, 2012, 11:46:51 AM
to redi...@googlegroups.com
Yes Felix, your understanding is right.


> 1.  Master (in production, getting beat on by production clients)

I don't want to add any overhead on the master apart from the gaming engine.

> 2.  Slave

It should be there for failover, but it's not mandatory.

> 3.  ETL Process (here, written in Node) which connects up to the slave as a normal client, reads changed data, and pushes that changed data into the DB.

Yes, you are right. However, I am still not clear on how the ETL process would know which data has changed on the slave.

> 4.  DB

> a.  depending on use case it may be easier/simpler to merely periodically copy off the AOF files to another server and start up a new redis instance there without having it be an explicit slave, because the slave semantics include the fact that the slave will always attempt to catch back up to the master all the time, which may not be what you want if you are trying to capture a point in time.

Yes, I want to capture point-in-time data. But is it a good idea to load the AOF file every 5 minutes? Or is there a better solution?


Thanks,
Santosh

Sripathi Krishnan

Sep 6, 2012, 11:52:32 AM
to redi...@googlegroups.com
Since you want to capture point-in-time data, you should simply parse the dump.rdb file and update the MySQL database.

Configure Redis to save dump.rdb every 5 minutes, rsync the file to another server, use one of the several RDB parsers to parse the file, and then update your MySQL schema. This process doesn't add any overhead to your Redis master or slave.
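
Something like this sketch, run from cron every 5 minutes; the hosts, the paths, and the parser script name are all hypothetical:

```shell
#!/bin/sh
# One pass of the pipeline, meant to be run from cron every 5 minutes.
# Assumes the master's redis.conf has "save 300 1" (or similar).

# 1. Ship the latest snapshot off the Redis box.
rsync -az redis-box:/var/lib/redis/dump.rdb /data/redis/dump.rdb

# 2. Parse the snapshot and upsert rows into MySQL (hypothetical script
#    built on one of the rdb parsers).
/usr/local/bin/rdb-to-mysql.py /data/redis/dump.rdb
```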

--Sri

Felix Gallo

Sep 6, 2012, 11:55:00 AM
to redi...@googlegroups.com
As I'm fond of saying, Redis is not a database, it's a database construction kit.  So if you want your ETL process to 'know' when data has been changed, you either want MONITOR (as Josiah noted), or you want to roll your own logic for doing that.

In my Facebook game architecture, I rolled my own.  My front end keeps track of all of the open sessions, which correlate to user accounts.  If a session seems to have ended -- i.e., I haven't gotten an API request or a heartbeat within a reasonable time period -- then I figure that the user is done for now and add their id to a queue to be processed.  The back-end workers then watch that queue and, for each entry on the queue, lock the records, save the data to the database, and (optionally, depending on how full the Redis cache is at that time) delete the account from the Redis cache.

Building the logic and the queue depends on you, but the good news is that you control every aspect of the solution, so you likewise maintain complete control over the ETL system's performance and resource envelope.
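
Stripped of the locking and the Redis/DB specifics, the queue pattern looks roughly like this (all names hypothetical; plain dicts stand in for the Redis cache and the database):

```python
# Toy version of the session-end queue: the front end reaps idle
# sessions onto a queue, and a back-end worker drains it into the DB.
from collections import deque

SESSION_TIMEOUT = 300  # seconds without a heartbeat before "done for now"

sessions = {}          # front end: user_id -> timestamp of last heartbeat
save_queue = deque()   # user ids whose data should be flushed to the DB

def heartbeat(user_id, now):
    sessions[user_id] = now

def reap_idle(now):
    """Front end: queue users whose sessions appear to have ended."""
    for user_id, last in list(sessions.items()):
        if now - last > SESSION_TIMEOUT:
            save_queue.append(user_id)
            del sessions[user_id]

def worker_pass(cache, db):
    """Back-end worker: drain the queue, persist, evict from the cache."""
    while save_queue:
        user_id = save_queue.popleft()
        db[user_id] = cache.pop(user_id, None)

heartbeat("u1", now=0)
reap_idle(now=301)           # 301s later: no heartbeat, so u1 is queued
cache, db = {"u1": {"gold": 7}}, {}
worker_pass(cache, db)
```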

F.

Javier Guerra Giraldez

Sep 6, 2012, 12:11:32 PM
to redi...@googlegroups.com
On Thu, Sep 6, 2012 at 10:55 AM, Felix Gallo <felix...@gmail.com> wrote:
> As I'm fond of saying, Redis is not a database, it's a database construction
> kit. So if you want your ETL process to 'know' when data has been changed,
> either you want (as Josiah noted) MONITOR, or you want to roll your own
> logic for doing that.

One thing I'd love to see (or to have the time to code myself) is a
slave-protocol Redis library -- either a new library, or maybe new
functionality for hiredis.

Ideally, you would create a connection and install some simple filter.
Let's say you're interested in SET, INCR, DECR, and DEL commands
over keys that follow some pattern; then you wait for events. The
library connects to the server as a new slave, getting all
data-modifying commands, but silently discards those not matching the
filter. Those that do match are reported to the application.

Maybe not a library... what about a new 'monitor gateway' mode that
connects as a slave? Client applications connect to it (or Lua
scripts?), use some commands to create a filter (as above), and
associate them with message channels. The advantage is that there's no
need for new client libraries, and slow applications wouldn't tax the
master.

Or am I complicating things needlessly?
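
The filter half is the easy part; over already-parsed (command, key, args...) tuples it would look something like this sketch (the hard part it skips is parsing the replication stream itself):

```python
# Sketch of the command filter: keep only watched, data-modifying
# commands whose key matches a glob pattern. The pattern is hypothetical.
from fnmatch import fnmatch

WATCHED_COMMANDS = {"SET", "INCR", "DECR", "DEL"}
KEY_PATTERN = "user:*"

def filter_events(stream):
    for command, key, *args in stream:
        if command.upper() in WATCHED_COMMANDS and fnmatch(key, KEY_PATTERN):
            yield (command.upper(), key, *args)

stream = [("SET", "user:1", "bob"), ("GET", "user:1"), ("DEL", "session:9")]
events = list(filter_events(stream))
```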

--
Javier

Dvir Volk

Sep 6, 2012, 12:23:52 PM
to redi...@googlegroups.com
I recently experimented with a simple patch that makes Redis skip the full sync when a slave connects to it, if the slave doesn't want one.
This lets you receive push updates on the slave in the plain Redis protocol, so writing a client for it should be trivial.
What I wanted to do with it is create a local LRU cache of "hot" data that can be updated in place for simple keys, or invalidated for complex data structures.
But it can also enable things like prefix search on keys, full-text indexing, and generally extending Redis without touching it.

--
Dvir Volk
Chief Architect, Everything.me

M. Edward (Ed) Borasky

Sep 6, 2012, 1:16:56 PM
to redi...@googlegroups.com
On Thu, Sep 6, 2012 at 8:52 AM, Sripathi Krishnan
<sripathi...@gmail.com> wrote:
> Since you want to capture point in time data, you should simply parse the
> dump.rdb file and update the MySQL database.
>
> Configure redis to save dump.rdb every 5 minutes, rsync the file to another
> server, use one of the several rdb parsers to parse the file, and then
> update your mysql schema. This process doesn't add any overhead to your
> redis master or slave.
>
> --Sri

Yeah - that's what I was thinking. Or better yet, mirror both the
master and slave disk state to a third non-Redis server at the disk
level.
Twitter: http://twitter.com/znmeb; Computational Journalism Publishers
Workbench: http://j.mp/QCsXOr

How the Hell can the lion sleep with all those people singing "A weem
oh way!" at the top of their lungs?

Colin Vipurs

Sep 7, 2012, 4:30:41 AM
to redi...@googlegroups.com
$ redis-cli -p <port> monitor | awk '/<command>/ { print $2,$3,$4 }'

> One thing i'd love to see (or to have the time to code myself) is a
> slave-protocol Redis library. Either a new library or maybe new
> functionality for hiredis.
>
> Ideally, you would create a connection and install some simple filter.
> lets say, you're interested in SET, INCR, DECR, and DEL commands,
> over keys that follow some pattern. then you wait for events. The
> library connects to the server as a new slave, getting all
> data-modifying commands, but silently discards those not matching the
> filter. those that do match, are reported to the application.
>
> Maybe not a library... what about a new 'monitor gateway' mode, that
> connects as a slave. client applications connect to it (or Lua
> scripts?), use some commands to create a filter (as above) and
> associate them to message channels. the advantage is that there's no
> need for new client libraries, and slow applications wouldn't tax the
> master.
>
> or am i complicating things needlessly?
>
> --
> Javier
>



--
Maybe she awoke to see the roommate's boyfriend swinging from the
chandelier wearing a boar's head.

Something which you, I, and everyone else would call "Tuesday", of course.