The dangers of timestamp based database synchronization

17 views
Skip to first unread message

perv...@gmail.com

unread,
Jan 22, 2009, 12:45:55 AM1/22/09
to pervasync
We know that in synchronization you cannot simply record the SQL
statements on one database and replay them on another database. The
reason is that the client database typically only has a sub-set of
server database data and the same SQL statement could do different
things on server than on client. Instead, you should somehow track the
changes to records/rows and apply the changes to the destination DB.

The most popular change tracking method on server DB is based on
timestamps. It looks very straightforward. You add a timestamp column
to your table and update the timestamp column whenever you change a
row. This can be done in your SQL statement or through triggers. By
the way, deletes have to be tracked separately, maybe in a companion
table. Then at sync time, the sync client would come in with a last
sync timestamp and you select all the server changes that have a newer
timestamp than the last sync timestamp.

As I said, this is a widely used technique when people have to
implement their own sync logic. Also, some sync solutions put the
burden of change tracking on app developers and this is the number one
recommended technique. However, be aware of its pitfalls.
One obvious pitfall is system time. This is timestamp based so be
careful with system time. Don’t adjust system time even it is wrong.
Do not sync during time shifts between daylight saving time and
standard time.

There is a more serious problem with this technique that could cause
change loss. Let me try to explain it. The default isolation level for
Mysql with innodb and Oracle is “Read Committed”, which means that
others cannot see the changes before a transaction is committed. Let’s
say at time T1 you modified a record R1. The timestamp column would
have a value of T1. Then before you commit, a sync happened at T2 (T2
> T1). This sync could not pickup the change to record R1 since the
transaction was not committed and the change was invisible. At T3 you
committed the transaction and at T4 (T4>T3>T2>T1) you do another sync.
Guess what, you still won’t get the change to R1! It was committed but
it has a timestamp T1 that is older than last sync time T2. The client
will forever miss the change no matter how many times you sync.

This problem is not so well known and is very hard to workaround in a
production environment.

Fortunately Persasync (http://www.pervasync.com) has an innovatively
designed sync engine that can take care of all the sync issues for
you. You don’t need to have any timestamp column to worry about. Just
do your normal DB DML operations and the system would track the
changes for you and guarantee no change loss.
Reply all
Reply to author
Forward
0 new messages