Best way to resync replicated database with the primary one (MSA)?

Alex Shafir

unread,

Oct 4, 2004, 11:36:27 AM10/4/04

to

When the replication gets stuck due to some problems/errors I’d like
to stop replication temporarily and then after the issues that caused
the problem have been fixed to resync the replicated database with
it’s primary one and resume replication. My current course of action:

1) Drop database subscription – because, most often,
DSI thread to the replication database is shutdown due to errors
it can only be resumed skipping offending transactions in queue;

2) Fix the problems;

3) Define database subscription again with “use dump marker”,
then dump primary database to a file and load the dump
to the replicated database;

4) Resume DSI thread to the replicated database.

It all kind of works, but there are two issues:

a) How to cleanly drop database subscription when you have offending
transactions in queue and DSI thread to replicated database keeps
shutting down?

b) After database subscription is recreated and replication resumed
there are often error messages #5070 “DSI: loss detected” =
in replication server error log – my guess here is that stable
queues have to be somehow reinitiated – I just don’t know how?

I don’t want to recover anything here. I just want to start database replication
from a clean slate again without disturbing replication for other databases.
It’s all on the same site, so, primary and replication RepServ is the one
and only here.

Any insight will be greatly appreciated.

Alex Shafir

Jeff Tallman

unread,

Oct 4, 2004, 5:06:24 PM10/4/04

to

I think the best way might be similar to a WS.....

1) Leave the database repdef in place
2) Copy out all the repagent configs for the MSA at replicate
3) Shutdown the primary rep agent
4) Dump/load the primary database to replicate
5) remove all rows with dbid >0 from rs_lastcommit (this is why you are
getting the loss detected messages - you are overwriting the destination
rs_lastcommit with primary)
6) sysadmin sqm_purge_queue, replicate, 0 (purge outbound queue)
7) resume primary rep agent
8) unfortunately without repdefs, you may still get some dupes (dupe key
error, etc.). You may want to create repdefs and enable autocorrection.
Otherwise, just skip all trans that fail (could be a lot, hence the
suggestion to look at autocorrection).
9) While that is catching up, you will need to disable the replicate
repagent, truncate the log and re-configure repagent with the replicate
settings.

gene

unread,

Oct 15, 2004, 3:16:26 PM10/15/04

to

Here gents another alternative which I could use:

1. stop rep agent
2. dbcc settrunc('ltm', 'ignore')

fix all the problems.
Onse it's done, syncronize db's with your choice either by
bcp's, dumps.

enable replication:
1. dbcc settrunc('ltm','valid')
2. sp_start_rep_agent.

It seems easy enough.

Gene.
gene_...@hotmail.com

Alex Shafir

unread,

Oct 15, 2004, 5:34:20 PM10/15/04

to

Unfortunately, it's not that easy. There may be transactions left in
replication server queues that it may try to apply to the replicate
database although it has been already synced against primary.

gene

unread,

Oct 18, 2004, 4:18:16 PM10/18/04

to

yes you are right on that.
Hovewer I assume by the time you need to start replication
again all trunsaction will be applied to replicated.
If you have problems between rep server and replicated -
check the RS log and figure out the problem.
Sometimes you required to resume with skip tran and/or
truncate data in standby few times to let SQ trunsactions
flawing into RD.
It usually quite simple to drain SQ.
Once you are ready to get replication on and use
dumps/loads, few thinkgs are commended:
before dbcc ltm run rs_zeroltm.
Before load to backup, bcp out rs_lastcommit and then 'in'
after you load. Otherwise you might be required to run
'ignore loss command' after the load which is not big deal
too.