Replication of changes in wrong order violated foreign key constraints in PostgreSQL

174 views
Skip to first unread message

TonyB

unread,
Jan 5, 2011, 2:57:38 AM1/5/11
to rubyrep
Hi all,

We've just started using Rubyrep for our project to replicate our
PostgreSQL database in a master-slave configuration. Overall we've
been impressed, but have hit a problem where a small number of changes
to our :left db are not being replicated to the :right. The missed
changes show up as PGErrors in our rr_logged_events table and appear
not to be replicated to the :right db because of foreign key
violations.

We were careful to start out with fully sync'd databases and we can
see from the error messages that the referenced foreign keys are for
records that should be replicated *before* the failed changes. It
looks like ruby rep is occasionally trying to insert or update records
with foreign key references to changes that have yet to be applied.
I've seen reference to this issue here before, but there appears to
have been no solution.

I've looked through the rubyrep code and from what I can work out the
processing of the changes in the rr_pending_changes table is by ID,
which should ensure that changes are processed in the order in which
they are inserted into the table. I'm wondering if the triggers that
populate the change table are being called out of sequence somehow.
I'm not clear on how rubyrep handles changes that were part of a
single transaction. Are these applied atomically to the other
database?

Any help with this would be much appreciated. While the replication
is impressive the failure to replicate even a very small number of
records makes rubyrep much less dependable that we'd like.

Many thanks,

Tony.

TonyB

unread,
Jan 10, 2011, 5:08:44 AM1/10/11
to rubyrep
Hi all,

> they are inserted into the table.  I'm wondering if the triggers that
> populate the change table are being called out of sequence somehow.

Having thought a little more about this, I have a theory as to how FK
constraints can be violated during replication.

As far as I can tell RubyRep uses the triggers to record which table
rows have changed and when replicating the changes to the other
database uses the resulting rr_pending_changes to tell it what rows to
insert/update. Is it possible for a row to be inserted then changed
in one db before the row has been replicated to the other? If so, is
it therefore possible that a row that has yet to be inserted in the
second db is updated in the first to now contain a value in a FK
column and that the attempt to insert that row in the second db will
fail due to a FK constraint on that column?

To explain, our company sells insurance online and we like to allow
our customers return to our site to revisit their quotations. We have
a 'quotes' table to hold the quote data and a 'users' table for our
cusomers' user accounts. Our 'quotes' table has a FK column 'user_id'
that references 'users.id' and their is a constraint on this column.

We first insert into the 'quotes' table at the beginning of the quote
process. At some point later in the process a 'user' account can be
created and inserted into 'users' and then associated with the quote
by updating quotes.user_id with the id of the user. So far so good.
However, if the time elapsed between the initial creation of the entry
in 'quotes', the insertion of a 'user' and the update of the quote row
is small, then it can all happen *before* rubyrep has replicated any
of it. Since rubyrep would process changes in order, you'd hope that
the replication would be smooth.

However, I think what's happening is that rubyrep sees that it needs
to insert a row into 'quotes', before inserting into 'users', but that
the columns it inserts will include the FK user_id in its final state
thus breaking the FK constraint, because it hasn't yet processed the
'users' row.

Does that make sense? If so, what can be done about it. We've moved
to Bucardo for now, but I'd like to be able to return to Rubyrep for
its ease of use and because our app is a Rails app.

Arndt Lehmann

unread,
Mar 7, 2011, 6:08:10 AM3/7/11
to rubyrep
Hi Tony,

If you still want it, please try rubyrep version 1.2.0. Should have a
fix for this issue.

To be specific:
The proper way would be to
(1) Do the first insert of the failing record with the foreign-key
column being null.
(2) After the dependent record is created, update the record created
in (1) with the proper foreign key reference.

But that would require rubyrep to track dependencies between updates.
And it would require knowledge of which columns of a table are foreign
key references to which other tables / columns. rubyrep doesn't have
that.

So as an easier fix that should take care of most situations, rubyrep
simply retries inserts that failed due to missing dependent records at
the end of the replication run.

Best Regards,
Arndt

Tony Byrne

unread,
Mar 7, 2011, 6:11:36 AM3/7/11
to rub...@googlegroups.com
Arndt,

Thanks for the update!

On 7 Mar 2011, at 11:08, Arndt Lehmann wrote:

> Hi Tony,
>
> If you still want it, please try rubyrep version 1.2.0. Should have a
> fix for this issue.

Regards,

TonyB.

Reply all
Reply to author
Forward
0 new messages