full database sync to newdb

23 views
Skip to first unread message

Zane Williams

unread,
Oct 8, 2009, 5:48:47 PM10/8/09
to rubyrep
I've been looking through the configs and docs, but haven't found
anything that explicitly says this isn't possible. So I thought I
would ask here. Basically, I have two databases that I want to sync
using rubyrep. The "left" version is the one in production, and fully
populated. The "right" is a new import of the basic structure,
missing some tables that are added after data is populated(via a
second python interface to the db).

The current behavior, is that only tables that exist on both sides are
replicated, and tables that only exist on the "left" side are
omitted. Is there a config option that would enable rubyrep to create
the tables, and then sync them?

The end goal, will be to use heartbeat to load balance an IP address
between the two servers, and if postgrest should die/something bad
happen/, the IP would move via heartbeat to the secondary server.
This would in theory provide the redundancy I need, and maybe even
high availability depending upon how quick I can get heartbeat to poll
and switch. Comments, conerns, rants?

Thanks,

Zane

Tom Copeland

unread,
Oct 8, 2009, 8:08:18 PM10/8/09
to rub...@googlegroups.com

That's exactly what I'm looking for as well.

Yours,

tom

Nishkarsh Kulshreshtha

unread,
Oct 8, 2009, 11:27:24 PM10/8/09
to rub...@googlegroups.com
Hi,

RubyRep works on triggers and which can only be defined on existing tables. So it is not possible to have a replication which involves any DDL operation i.e. only DML operations can be replicated.

Regards
Nishkarsh
--
c u
Nishkarsh(Nikky)

Tom Copeland

unread,
Oct 9, 2009, 12:55:45 AM10/9/09
to rub...@googlegroups.com

On Oct 8, 2009, at 11:27 PM, Nishkarsh Kulshreshtha wrote:

> Hi,
>
> RubyRep works on triggers and which can only be defined on existing
> tables. So it is not possible to have a replication which involves
> any DDL operation i.e. only DML operations can be replicated.

Hm... couldn't TableSpecResolver#resolve create the tables on the
destination side rather than just raising that exception?

Yours,

tom

Arndt Lehmann

unread,
Oct 9, 2009, 2:00:59 AM10/9/09
to rubyrep
Hi All,

great discussion.
And no doubts: automatic replication or at least manual syncing of
schema changes would be a nice feature.

There are 2 reason why I didn't implement it:
1.
The simple implementation would not work correct in many cases. And a
correct implementation would be quite hard. Worse: to make it correct
I would have to add tons of database specific logic.
(I. e.: I am lazy ;-)

2.
There are certain classes of problems where no algorithm could know
what the correct steps for syncing the schema are.


Regarding (1):
The simple solution would be to rely on the ActiveRecord (the used
database access library) functionality for schema changes.
Problem:
ActiveRecord only deals with basic features like creating tables and
adding / removing columns.
There is no support for constraints (foreign key or anything else), no
support for indexes, table spaces, etc.
So to correctly implement this, I would have to develop tons of
database specific code.

Regarding (2):
There are scenarios where no automatic algorithm could correctly sync
schema changes.
Example:
A column in renamed. An automatic algorithm has no way to know if
* there was indeed a column rename OR
* if the old column was deleted and a new one added

Before I implement a feature that only works in 9 out of 10 cases and
in the rest does horribly wrong things, I rather don't add this
feature at all.


Having said all that, there are Open Source developers who have
addressed this problem.
E. g. this is one of the tools that provides schema compares and syncs
http://www.sql-workbench.net

Also, for those that develop / maintain a web application and need to
update multiple database servers with continuously upcoming schema
changes, look at something like ActiveRecord schema migrations
http://api.rubyonrails.org/classes/ActiveRecord/Migration.html
In my option the superior approach as it neatly solves problem (2).
[Similar tools are also available for other programming languages.]

Regards,
Arndt

PS: But this is open source. So if anybody wants to give it a go, I
don't want to discourage you.

Tom Copeland

unread,
Oct 9, 2009, 10:26:49 AM10/9/09
to rub...@googlegroups.com

Hi Arndt -

Thanks for rubyrep! And you're right, those concerns are very
valid... I agree, this problem is extremely tricky and it's easy for
something to go terribly wrong.

I'm wondering if there's some way to make it workable in the certain
narrow case where we're using Rails+ActiveRecord+PostgreSQL and only
using rubyrep as a hot standby, so schema changes (and writes in
general) are only made to one DB. Let's see, what situations would
we hit:

- if a new table is created in the "left" db, add triggers to new
table in "left" db and create the table in the "right" db

- if a table schema changes in the "left" db, drop that table in the
"right" db, recreate it, and resync This would mean a complete rsync
of that table, which would be terrible for large tables... but it
would work. Alternatively, we could hack ActiveRecord migrations to
record DDL changes somewhere and rubyrep could pick those up. That
would be really tricky though... very failure-prone.

Hm... trying to think of any other situations. I agree, this is a
doozy...

Yours,

Tom

Reply all
Reply to author
Forward
0 new messages