Suspending HDR temporarily?

pretzel

Feb 4, 2010, 3:03:01 AM

I would like to populate a bunch of tables in database 'B' with
consistent data from similar tables in database 'A'. Database 'A' is
the secondary of an HDR pair, whose primary is a production database.
To get consistent data, I could stop replication, and extract from
database 'A' while no other users are connected. That way I could use
database 'A' to extract consistent data from a snapshot of a
production database. The extractions are of sufficient size that it
is not feasible to attempt to get the data from actual production.

The only downside for what I want to do is that I would need to re-
initialize replication from scratch (physical restore of level 0
archive).

Is there a way to "pause" or "suspend" the replication process so that
I can extract "unflickering" data from the secondary, and then resume
replication without having to do a physical restore of a level 0?

I'm wondering if there is some way to cause the logs to just remain at
the primary, similar to what happens when initializing HDR from
scratch; i.e., they just accumulate on the primary until the
secondary is established (after a physical restore) with the
'onmode -d secondary <primary>' command.

Thank you.

DG

Art Kagel

Feb 4, 2010, 6:39:48 AM

to pretzel, inform...@iiug.org

There is a way, but it's a bit backhanded:

- Checkpoint the primary and wait for the secondary to process the checkpoint.
- Shut down the secondary (so server A?).
- Change the sqlhosts file on A so that server A can't connect to the primary (so change the host from what it is to 'noserverhere').
- Bring the secondary online.
- Take your extract.
- Shut down the secondary.
- Repair the sqlhosts file.
- Restart the secondary and it will catch up.

This assumes you do not have DRAUTO set to convert the primary into a standalone server if it loses contact with the secondary and the secondary is not set to automatically become a primary when it loses contact.
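
The sqlhosts edit in the steps above could be scripted rather than done by hand. The following is only a sketch under stated assumptions: it assumes the usual sqlhosts field order (dbservername, nettype, hostname, servicename), and the function name `redirect_host`, the server name `sec1`, the host names and the service names are all made up for illustration. It works on text, so nothing is written to a real sqlhosts file.

```python
def redirect_host(sqlhosts_text: str, server: str, new_host: str) -> str:
    """Return sqlhosts text with the hostname field of `server` replaced.

    Assumed entry layout: dbservername nettype hostname servicename [options]
    """
    out = []
    for line in sqlhosts_text.splitlines():
        fields = line.split()
        is_comment = line.lstrip().startswith("#")
        if len(fields) >= 4 and not is_comment and fields[0] == server:
            fields[2] = new_host          # third field is the host name
            line = " ".join(fields)       # whitespace on this line is normalized
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical sqlhosts content; only the layout matters here.
sample = (
    "# dbservername nettype hostname servicename\n"
    "prod1  onsoctcp  hosta  svc_prod1\n"
    "sec1   onsoctcp  hostb  svc_sec1\n"
)

print(redirect_host(sample, "prod1", "noserverhere"))
```

Keeping a copy of the original file makes the later "repair the sqlhosts file" step a simple copy back.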

Sketchy I know, but it will get the job done.

Art

Art S. Kagel
Advanced DataTools (www.advancedatatools.com)
IIUG Board of Directors (a...@iiug.org)

See you at the 2010 IIUG Informix Conference
April 25-28, 2010
Overland Park (Kansas City), KS
www.iiug.org/conf

Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on my employer, Advanced DataTools, the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference. Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.

_______________________________________________
Informix-list mailing list
Inform...@iiug.org
http://www.iiug.org/mailman/listinfo/informix-list

Art Kagel

Feb 4, 2010, 6:48:06 AM

to pretzel, inform...@iiug.org

Another option, get my dbexport/dbimport replacement utility, myexport. You can run it on the secondary while it is still replicating since it does not lock the database. If you run myexport in parallel mode, with the exception table option enabled, at a time when the primary is experiencing few updates/inserts/deletes there is a very good chance that the export will be consistent. The myimport script has a mode that uses exception tables and sets constraints to 'FILTERING' mode to record inconsistent data. You'll have to manually extract the inconsistent keys from the exception tables and delete those rows from the affected tables before setting the constraints to ENABLED and dropping the exception tables (I haven't spent the time to get that automated yet), but it works.

Myexport is downloadable free from the IIUG Software Repository. You will also need my utils2_ak package (myexport uses myschema to get a dbexport/dbimport compatible schema file), Jonathan Leffler's sqlcmd package (the default utilities used to unload and load the data), and, if you want to use myexport's HPLoader options, Ravi Krishna's myonpload utility. All are in the Repository.

Art

Fernando Nunes

Feb 4, 2010, 7:15:50 AM

On IDS 11.50.FC5 you have a feature that does that. It's called
DELAYED APPLY. Please check the release notes.

The state of the secondary (consistency) may not be exactly what you
need. The data is stalled, but you may see "half transactions". So,
technically it would not be considered consistent (in ANSI eyes).

Regards.

Bartlomiej Lidke

Feb 4, 2010, 10:04:58 AM

pretzel <david...@gmail.com> wrote:
> Is there a way to "pause" or "suspend" the replication process so that
> I can extract "unflickering" data from the secondary, and then resume
> replication without having to do a physical restore of a level 0?

I am not sure if I understand you correctly, but why don't you simply
put the primary instance into standard mode ("onmode -d standard"), do your
work on the secondary, and then reconnect from the primary ("onmode -d primary
secondary-ifxserver")? Just make sure you have enough logical logs on the
primary.

Try this in a test environment first :-)

--
butthead

pretzel

Feb 4, 2010, 12:38:52 PM

Art,

Thank you for your always helpful comments. Your "brute force"
suggestion is not at all "sketchy", but clear and complete. We don't
have DRAUTO activated, so your plan is perfectly straightforward and
doable. Thank you.

Mr. Nunes,

Thank you for the steer. By "consistent" I was mainly concerned about
referential integrity issues. I will have to do some research to
educate myself on the (ANSI) meaning. There is probably a ton of
esoterica that I have not considered.

Mr. Lidke,

That was actually my first inclination, but I don't have two spare
machines to try the experiment. Also, upon further thought, it
occurred to me that changing the mode of primary to standard may do
something, like create a log entry, that terminates the replication
relationship between the two servers, thus making it necessary to do
the "level 0 dance". I admit I have not tried it.

Thanks to all.

Regards,

DG

pretzel

Feb 4, 2010, 12:41:06 PM

P.S. Congratulations on your new position, Art.

Bartlomiej Lidke

Feb 4, 2010, 1:31:06 PM

pretzel <david...@gmail.com> wrote:
> That was actually my first inclination, but I don't have two spare
> machines to try the experiment.

You can test HDR on a single machine. Just remember to use relative
paths to devices/files and a socket connection (not ipcshm or ipcstr;
HDR works over sockets only, even when both servers are on the same machine).

> Also, upon further thought, it
> occurred to me that changing the mode of primary to standard may do
> something, like create a log entry, that terminates the replication
> relationship between the two servers, thus making it necessary to do
> the "level 0 dance". I admit I have not tried it.

After changing the primary server to standard, it can keep filling logical
logs as long as they don't wrap all the way around in round-robin fashion.
The reason is simple: the secondary must be able to get the logs after
you connect them again. If the primary overwrites a logical log
that has not been sent to the secondary, HDR cannot be reestablished.
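
The wrap-around constraint described above can be put into rough numbers. This is a simplified sketch, not a description of Informix internals: it assumes a plain round-robin over a fixed number of log files and ignores log backups. The function name `logs_until_overwrite`, the figure of 20 log files and the log IDs are all made up for illustration.

```python
def logs_until_overwrite(total_logs: int, last_sent_id: int, current_id: int) -> int:
    """How many more log switches the primary can make before it would
    reuse the file holding the last log the secondary received.

    In a round-robin of `total_logs` files, log `last_sent_id` is
    overwritten when the current log ID reaches last_sent_id + total_logs.
    A negative result means that log is already gone.
    """
    used_since_split = current_id - last_sent_id
    return total_logs - 1 - used_since_split


# Hypothetical values: 20 logical logs, secondary stopped at log 100,
# primary is now writing log 105.
remaining = logs_until_overwrite(total_logs=20, last_sent_id=100, current_id=105)
print(f"log switches left before HDR can no longer resume: {remaining}")
```

If the result gets close to zero while the extract is still running, that is the point at which to reconnect or accept a level 0 restore.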

--
butthead

Art Kagel

Feb 4, 2010, 8:02:32 PM

to pretzel, inform...@iiug.org

Thanks.

Art

On Thu, Feb 4, 2010 at 12:41 PM, pretzel <david...@gmail.com> wrote:

P.S. Congratulations on your new position, Art.

pretzel

Feb 5, 2010, 3:58:37 PM

On Feb 4, 2:39 am, Art Kagel <art.ka...@gmail.com> wrote:
> There is a way but it's a bit back handed:
>
> - Checkpoint the primary and wait for the secondary to process the
>   checkpoint.
> - Shutdown the secondary (so server A?)
> - Change the sqlhosts file on A so that server A can't connect to the
>   primary (so change the host from what it is to 'noserverhere')
> - Bring the secondary online.
> - Take your extract
> - Shutdown the secondary
> - Repair the sqlhosts file
> - Restart the secondary and it will catch up.

I am encountering a great mystery, as follows:

I did the following:
1) I shut down the secondary.
2) On the secondary, I edited the sqlhosts file to remove the
servername for the primary.
3) I restarted the secondary.

Here is a portion of the online.log:

11:08:49 IBM Informix Dynamic Server Started.
11:08:49 Requested shared memory segment size rounded from 3423232KB to 3440640KB
11:08:49 Shared memory segment will use large pages with intimate shared memory (ISM) if available
11:08:54 Segment locked: addr=10a000000, size=3523215360
11:08:54 Shared memory segment will use large pages with intimate shared memory (ISM) if available
11:08:54 Segment locked: addr=1dc000000, size=536870912

Fri Feb 5 11:08:57 2010

11:08:57 Event alarms enabled. ALARMPROG = '/opt/informix/etc/alarmprogram.sh'
11:08:57 Booting Language <c> from module <>
11:08:57 Loading Module <CNULL>
11:08:57 Booting Language <builtin> from module <>
11:08:57 Loading Module <BUILTINNULL>
11:09:02 DR: DRAUTO is 0 (Off)
11:09:02 DR: ENCRYPT_HDR is 0 (HDR encryption Disabled)
11:09:02 Fast poll /dev/poll enabled.
11:09:02 Requested shared memory segment size rounded from 136KB to 1024KB
11:09:02 IBM Informix Dynamic Server Version 11.50.FC4 Software Serial Number AAA#B000000
11:09:04 IBM Informix Dynamic Server Initialized -- Shared Memory Initialized.

11:09:04 Started 1 B-tree scanners.
11:09:04 B-tree scanner threshold set at 5000.
11:09:04 B-tree scanner range scan size set to -1.
11:09:04 B-tree scanner ALICE mode set to 6.
11:09:04 B-tree scanner index compression level set to med.
11:09:04 Physical Recovery Started at Page (1:7119).
11:09:04 Physical Recovery Complete: 0 Pages Examined, 0 Pages Restored.
11:09:05 DR: Trying to connect to primary server = prod1
11:09:05 DR: Cannot connect to primary server
11:09:05 DR: Turned off on secondary server
11:09:05 Cannot create SMX pipes
11:09:05 Dataskip is now OFF for all dbspaces
11:09:05 Restartable Restore has been ENABLED
11:09:05 Recovery Mode
11:09:12 DR: Secondary server connected
11:09:13 DR: Using default behavior of failure-recovering Secondary server

11:09:14 DR: Failure recovery from disk in progress ...
11:09:14 Logical Recovery Started.
11:09:14 10 recovery worker threads will be started.
11:09:14 Start Logical Recovery - Start Log 24716, End Log ?
11:09:14 Starting Log Position - 24716 0x6a7018
11:09:21 Started processing open transactions on secondary during startup
11:09:21 Finished processing open transactions on secondary during startup.
11:09:21 Logical Log 24716 Complete, timestamp: 0x7f319f2a.
11:09:23 B-tree scanners disabled.
11:09:24 DR: HDR secondary server operational
11:09:25 Checkpoint Completed: duration was 1 seconds.
11:09:25 Fri Feb 5 - loguniq 24717, logpos 0x4018, timestamp: 0x7f319f55 Interval: 25170

11:09:25 Maximum server connections 0
11:09:25 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 230, Llog used 0

11:11:21 Booting Language <spl> from module <>
11:11:21 Loading Module <SPLNULL>
11:19:45 Checkpoint Completed: duration was 0 seconds.
11:19:45 Fri Feb 5 - loguniq 24717, logpos 0xca018, timestamp: 0x7f36a640 Interval: 25171

11:19:45 Maximum server connections 2
11:19:45 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 794, Llog used 0

Note the DR entries at 11:09:05 indicating the failure to connect to the
primary, and "turned off" on the secondary. This is exactly what I was
trying to accomplish and what I expected. No mystery, so far.

Note at 11:09:12, DR reports the secondary connected. AND, at
11:09:24, the secondary is operational!
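
Picking the DR state changes out of a long online.log is easier with a small filter. The sketch below is illustrative only: the function name `dr_events` and the regular expression are assumptions based on the message layout seen in this excerpt (a HH:MM:SS timestamp followed by "DR:"), and the sample text is a shortened copy of the lines quoted above.

```python
import re

# Timestamp, whitespace, then a message starting with "DR:".
DR_LINE = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+DR: (.*)$")


def dr_events(log_text: str) -> list[tuple[str, str]]:
    """Return (timestamp, message) pairs for every DR line in the log text."""
    events = []
    for line in log_text.splitlines():
        match = DR_LINE.match(line.strip())
        if match:
            events.append((match.group(1), match.group(2).strip()))
    return events


excerpt = """\
11:09:05 DR: Trying to connect to primary server = prod1
11:09:05 DR: Cannot connect to primary server
11:09:05 DR: Turned off on secondary server
11:09:05 Cannot create SMX pipes
11:09:12 DR: Secondary server connected
11:09:24 DR: HDR secondary server operational
"""

for when, message in dr_events(excerpt):
    print(when, message)
```

Lines that were wrapped by a mail client lose their continuation in this filter, so it is best run against the log file itself.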

How did this happen? I viewed the sqlhosts file, and verified that
there is no entry for the primary inside the sqlhosts file on the
secondary.

Just as a test, I INSERTed a new row in a table on the primary, and
Voila! it replicated to the secondary.

So, this recipe did not suspend replication on the secondary.

What happened?

The only thing I can think of is that although the secondary didn't
know how to get to the primary, the primary still knew how to get to
the secondary, and DR must be smart enough to re-engage, even with
missing primary info on the secondary.

So, do I need to also edit the sqlhosts file on the primary?

Thank you.

DG

pretzel

Feb 5, 2010, 5:13:03 PM

> So, do I need to also edit the sqlhosts file on the primary?

But, then I would have to bounce the primary, right? (That's
production, can't do that.)

DG

Nilesh Ozarkar

Feb 5, 2010, 6:23:57 PM

to pretzel, informix-l...@iiug.org, inform...@iiug.org

pretzel <david...@gmail.com> wrote on 02/05/2010 04:13:03 PM:

> > So, do I need to also edit the sqlhosts file on the primary?
>
> But, then I would have to bounce the primary, right? (That's
> production, can't do that.)

No, just comment out the secondary server (HDR) entry in the sqlhosts file on the primary, and the primary won't be able to reconnect.
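
Commenting the entry out and putting it back later could be sketched like this. It is a hedged illustration, not a tested procedure: the function names `comment_out` and `uncomment`, the server name `sec1`, and the host and service names are hypothetical, and the code only transforms text rather than touching a real sqlhosts file.

```python
def comment_out(sqlhosts_text: str, server: str) -> str:
    """Prefix the entry whose first field is `server` with '#'."""
    out = []
    for line in sqlhosts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == server:
            line = "#" + line
        out.append(line)
    return "\n".join(out) + "\n"


def uncomment(sqlhosts_text: str, server: str) -> str:
    """Undo comment_out: strip the leading '#' from the entry for `server`."""
    out = []
    for line in sqlhosts_text.splitlines():
        if line.startswith("#") and line[1:].split()[:1] == [server]:
            line = line[1:]
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical sqlhosts content on the primary.
sample = (
    "# dbservername nettype hostname servicename\n"
    "prod1  onsoctcp  hosta  svc_prod1\n"
    "sec1   onsoctcp  hostb  svc_sec1\n"
)

disabled = comment_out(sample, "sec1")
print(disabled)
print(uncomment(disabled, "sec1") == sample)
```

The round trip at the end is the useful property: the restore step should give back exactly the file that was there before the extract.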

Ensure that you have enough logical logs available on the primary server to avoid log recycling; otherwise you will have to do the inevitable (level-0 dance), which you wanted to avoid in the first place.

The best thing to do here is to use the STOP_APPLY or DELAY_APPLY functionality, which was introduced in 11.50.xC5. It is designed to stop (or delay) the replication at a given point and then resume at a later stage. More details: http://publib.boulder.ibm.com/infocenter/idshelp/v115/topic/com.ibm.admin.doc/ids_admin_1255.htm

- Nilesh -

pretzel

Feb 5, 2010, 7:53:30 PM

On Feb 5, 2:23 pm, Nilesh Ozarkar <nile...@us.ibm.com> wrote:
> No, just comment the secondary server (HDR) entry from sqlhosts file on the
> primary and primary won't be able to reconnect.
>
> Best thing to do here, is use STOP_APPLY or DELAY_APPLY functionality which
> got introduced in 11.50.xC5. It is designed to stop (or delay) the
> replication at a give point and then resume at later stage. more details - http://publib.boulder.ibm.com/infocenter/idshelp/v115/topic/com.ibm.a...
>
> - Nilesh -

Nilesh,

Thank you.

It didn't penetrate my thick skull when Mr. Nunes pointed me to this
feature, in an earlier post.

Unfortunately, although this is exactly what I want (the ability to
temporarily suspend replication), we are running 11.50.FC4. So, until
we upgrade, tough luck for us.

I will try editing the sqlhosts file on the primary. I am glad to
hear that, apparently, the sqlhosts info is read dynamically as
required, and not retrieved from some cache that is only refreshed
when server is initialized.

DG