RMAN not retrying on media manager errors

Ian Chard

Apr 6, 2010, 10:00:48 AM
Hi,

I'm using Oracle 10g, RMAN and TDPO (the TSM client for Oracle). I'm
trying to do a PITR to a non-production machine, but RMAN is intolerant
of any media manager problems that crop up, so things like this

ORA-19511: Error received from media manager layer, error text:
ANS1017E (RC-50) Session rejected: TCP/IP connection failure

and this

ORA-19511: Error received from media manager layer, error text:
ANS1314E (RC14) File data currently unavailable on server

result in an immediate 'failover to previous backup'. The first error
was caused by a transient network problem; I suspect the second was just
bad luck as the file was being reclaimed by the TSM server when TDPO
asked for it.

Both these errors would have gone away if RMAN had tried again, so is
there any way I can tell it to retry on error? If not, is there
something else I could do to improve the situation?
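
To make the question concrete, the retry behaviour I'm after could be approximated outside RMAN with a small wrapper. This is only a sketch under assumptions, not an RMAN or TDPO feature: the function names, the attempt count, the delay and the `restore.rcv` command file are all made up for illustration, and it re-runs the whole command file, so it only makes sense if that script is safe to run again. It scans the output rather than trusting the exit code, because RMAN may fail over to a previous backup and still finish.

```python
import subprocess
import time

# Error codes treated as transient. ORA-19511 comes from the errors quoted
# above; treating every ORA-19511 as retryable is an assumption.
TRANSIENT_MARKERS = ("ORA-19511",)


def is_transient(output, markers=TRANSIENT_MARKERS):
    """Return True if the captured output contains a known transient error."""
    return any(marker in output for marker in markers)


def run_with_retry(cmd, attempts=3, delay=60.0,
                   run=subprocess.run, sleep=time.sleep):
    """Run cmd, re-running it while its output shows a transient error.

    Returns (succeeded, last_output). `run` and `sleep` are injectable so
    the retry logic can be exercised without a real RMAN installation.
    """
    output = ""
    for attempt in range(1, attempts + 1):
        result = run(cmd, capture_output=True, text=True)
        output = (result.stdout or "") + (result.stderr or "")
        if not is_transient(output):
            # Clean run or a non-transient failure: retrying will not help.
            return result.returncode == 0, output
        if attempt < attempts:
            sleep(delay * attempt)  # linear backoff before the next try
    return False, output


if __name__ == "__main__":
    # Hypothetical invocation; restore.rcv is a placeholder command file.
    ok, log = run_with_retry(["rman", "target", "/", "cmdfile=restore.rcv"])
    print("restore succeeded" if ok else "restore failed")
    print(log)
```

The backoff is there because both errors above looked time-dependent (a network blip, a file being reclaimed), so an immediate retry would probably hit the same condition.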

Thanks
- Ian

--
Ian Chard, Senior Unix and Network Gorilla | E: ian....@sers.ox.ac.uk
Systems and Electronic Resources Service | T: 80587 / (01865) 280587
Oxford University Library Services | F: (01865) 242287

Robert Klemme

Apr 7, 2010, 2:41:39 PM
On 06.04.2010 16:00, Ian Chard wrote:
> I'm using Oracle 10g, RMAN and TDPO (the TSM client for Oracle). I'm
> trying to do a PITR to a non-production machine, but RMAN is intolerant
> of any media manager problems that crop up, so things like this
>
> ORA-19511: Error received from media manager layer, error text:
> ANS1017E (RC-50) Session rejected: TCP/IP connection failure
>
> and this
>
> ORA-19511: Error received from media manager layer, error text:
> ANS1314E (RC14) File data currently unavailable on server
>
> result in an immediate 'failover to previous backup'. The first error
> was caused by a transient network problem; I suspect the second was just
> bad luck as the file was being reclaimed by the TSM server when TDPO
> asked for it.
>
> Both these errors would have gone away if RMAN had tried again, so is
> there any way I can tell it to retry on error? If not, is there
> something else I could do to improve the situation?

I worked with Oracle 10g, RMAN and Tivoli on Linux a few years ago.
We had such frequent hangs (the error message was buried somewhere and
RMAN just sat there doing nothing) that I created a DB metric to detect
that situation. The solution then was to kill RMAN manually. :-(
That's of course not a solution to your problem, but it might be an
indication that this kind of integration does not work too well, although
it sounds great on paper. Does anybody else have experience with that
combination?

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
