
DPM failed to communicate with the protection agent on ServerDC...


markm75

Nov 15, 2007, 9:14:03 PM
I'm reposting this question, as I have more details on this issue now.

Basically, at random times, if I create a new protection group for our
server called PSTDC (or ServerDC), the job times out and stops after some
number of minutes, never at the same point.

The error is "DPM failed to communicate with the protection agent on
pstdc.domain.local because the agent is not responding" (ID 43 Details:
Internal error code 0x8099090E)

First off, does anyone know what this internal error code means? So far
no one has had any real answers on this.

All of my other 3 servers seem to communicate just fine.

PSTDC is a GC/DC/DNS/DHCP/SQL2k5-Config Manager/SQL 2k-Dynamics/file server.

We have a "backup" server for the GC/DC/DNS/DHCP roles.

I've verified I can access both servers during this time. Ping works, as
does everything else, and there are no errors related to this in Event Viewer.

I've tried uninstalling the agent and reinstalling it (from the DPM server,
backup01). I've also tried installing the agent from the setup.exe on
PSTDC.

On PSTDC, the app log has several errors. I'll post them from start
to finish in case this helps. The start time was 7:03pm:

At 7:17, I have this error related to SMS server. It only appears when
the backup is running (I'm not sure if it's normal or not). All of these are
errors, not warnings:

Eventid 5438 Source SMS Server:

On 11/15/2007 7:17:12 PM, component SMS_MP_CONTROL_MANAGER on computer PSTDC
reported: MP Control Manager detected MP is not responding to HTTP requests.
The http error is 2147500037.

Possible cause: MP service is not started or not responding.
Solution: Manually restart the SMS Agent Host service on the MP.

Possible cause: IIS service is not responding.
Solution: Manually restart the W3SVC service on the MP.

For more information, refer to Microsoft Knowledge Base article 838891.
--------


At 7:40pm.. There is an ESENT database error in the app log:

Event Type: Error
Event Source: ESENT
Event Category: Logging/Recovery
Event ID: 215
Date: 11/15/2007
Time: 7:40:46 PM
User: N/A
Computer: PSTDC
Description:
tcpsvcs (1216) The backup has been stopped because it was halted by the
client or the connection with the client failed.

-----------

At 7:51pm (backup still going):

Eventid 85.. source: DPMRA:

Event Type: Error
Event Source: DPMRA
Event Category: None
Event ID: 85
Date: 11/15/2007
Time: 7:51:38 PM
User: NT AUTHORITY\SYSTEM
Computer: PSTDC
Description:
A DPM agent failed to communicate with the DPM service on backup01.pst.local
because of a communication error. Make sure that backup01.pst.local is
remotely accessible from the computer running the DPM agent. If a firewall is
enabled on backup01.pst.local, make sure that it is not blocking requests
from the computer running the DPM agent (Error code: 0x800706ba, full name:
backup01.pst.local).
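
For reference, the two numeric codes in the events above can be read as standard Windows HRESULT values. The sketch below splits an HRESULT into its severity, facility, and code fields and looks it up in a small table. The function name and the table are illustrative only, and the table covers just the two codes whose meaning is well known: 0x800706ba is Win32 error 1722 (RPC_S_SERVER_UNAVAILABLE) wrapped as an HRESULT, and 2147500037 is 0x80004005 (E_FAIL). The DPM internal code 0x8099090E is deliberately not in the table, since nothing in this thread says what it means.

```python
# Illustrative lookup table; only codes with a well-known meaning are listed.
KNOWN_HRESULTS = {
    0x800706BA: "RPC_S_SERVER_UNAVAILABLE (Win32 error 1722: the RPC server is unavailable)",
    0x80004005: "E_FAIL (unspecified failure)",
}


def decode_hresult(value):
    """Split a 32-bit HRESULT into its standard bit fields."""
    value &= 0xFFFFFFFF
    return {
        "hex": f"0x{value:08X}",
        "failure": bool(value >> 31),        # severity bit: set means failure
        "facility": (value >> 16) & 0x7FF,   # 11-bit facility (7 = FACILITY_WIN32)
        "code": value & 0xFFFF,              # low 16 bits: the error code itself
        "name": KNOWN_HRESULTS.get(value, "unknown"),
    }


# Codes taken from the DPMRA event and the SMS "http error" above.
for raw in (0x800706BA, 2147500037):
    print(decode_hresult(raw))
```

Read this way, the DPMRA event is reporting that the agent on PSTDC could not reach an RPC server on backup01.pst.local at that moment, which is a connectivity or name-resolution symptom rather than a DPM-specific failure.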

----------


Backup fails around 8:04pm.


Even after this, I still see the ESENT error and the SMS server error (but
these weren't here prior to starting the backup tests).


I don't see much in the system log, except a bunch of Kerberos event ID 4
errors (source: Kerberos) like this:

Event Type: Error
Event Source: Kerberos
Event Category: None
Event ID: 4
Date: 11/15/2007
Time: 7:51:24 PM
User: N/A
Computer: PSTDC
Description:
The kerberos client received a KRB_AP_ERR_MODIFIED error from the server
PST037$. The target name used was RPCSS/PST028.pst.local. This indicates
that the password used to encrypt the kerberos service ticket is different
than that on the target server. Commonly, this is due to identically named
machine accounts in the target realm (PST.LOCAL), and the client realm.
Please contact your system administrator.


There are no DNS errors either. There are a few DHCP database errors, but I
don't believe these to be the issue.
-------

Finally.. on backup01.. dpm alerts.. i do see the info pane at 7:03 when i
started has this:

eventid 1, DPM-EM event:

Event Type: Information
Event Source: DPM-EM
Event Category: None
Event ID: 1
Date: 11/15/2007
Time: 7:03:16 PM
User: N/A
Computer: BACKUP01
Description:

The replica of D:\ on pstdc.pst.local is being created. After the initial
copy is made, only incremental changes are synchronized. (ID: 3162)


DPM ID: 2^|^BACKUP01^|^Replica creation in
progress^|^DPM^|^Backup^|^pstdc.pst.local^|^57e3f31e-b545-4ca0-8a94-645ce6e445cf


But oddly, at 8:04 when it fails.. i see:
EventID 2, DPM-EM

Event Type: Information
Event Source: DPM-EM
Event Category: None
Event ID: 2
Date: 11/15/2007
Time: 8:04:37 PM
User: N/A
Computer: BACKUP01
Description:

The replica of D:\ on pstdc.pst.local is being created. After the initial
copy is made, only incremental changes are synchronized. (ID: 3162)


DPM ID: 2^|^BACKUP01^|^Replica creation in
progress^|^DPM^|^Backup^|^pstdc.pst.local^|^57e3f31e-b545-4ca0-8a94-645ce6e445cf


These general info ones are followed by the eventid 1, dpm-em failure report
about the replica being inconsistent at 8:04pm as well.


Any thoughts would be great.. this is the big picture as of now.

Kapil Malhotra [MSFT]

Nov 20, 2007, 7:55:50 AM
The error means that the DPM agent on the production server wasn't able to
send responses to commands from the DPM server. It looks like name resolution
failed somewhere along the way. The event indicates that the production
server got an RPC communication error when sending a response to the DPM
server.

--
Thanks,

Kapil
This posting is provided "AS IS" with no warranties, and confers no rights.


mark...@gmail.com

Nov 21, 2007, 12:08:32 PM
I've been up and down checking for communication errors on this server
and the other, and I can find none. Everything seems fine. Is there
another test I should run to find out what may be going on?
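
One possible test, given that the failure is intermittent and a one-off ping succeeds: run a small probe on PSTDC for the whole length of a backup attempt, so that any lapse in name resolution or TCP connectivity to the DPM server is logged with a timestamp that can be matched against the event log. The following is only a sketch. The function names are made up for illustration, and port 135 (the RPC endpoint mapper) is used as the default because the DPMRA event reports an RPC error; any other ports DPM needs in a given environment would have to be added by hand.

```python
import socket
import time


def probe(host, port, timeout=3.0):
    """Resolve host, then try one TCP connection. Returns (ok, detail)."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return False, f"name resolution failed: {exc}"
    address = infos[0][4][0]  # first address the resolver returned
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True, f"resolved to {address}, connected"
    except OSError as exc:
        return False, f"resolved to {address}, connect failed: {exc}"


def monitor(host, ports=(135,), rounds=10, interval=30.0):
    """Probe each port once per round; print every result, return the failures."""
    failures = []
    for i in range(rounds):
        for port in ports:
            ok, detail = probe(host, port)
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(f"{stamp} {host}:{port} {'OK' if ok else 'FAIL'} {detail}")
            if not ok:
                failures.append((stamp, port, detail))
        if i < rounds - 1:
            time.sleep(interval)
    return failures


# Example (run on PSTDC while the replica is being created):
# monitor("backup01.pst.local", rounds=120, interval=30.0)
```

A failure line that says "name resolution failed" would point at DNS, while "connect failed" after a successful resolve would point at the network path or the RPC service on the DPM server. A clean run across a failed backup would suggest looking elsewhere, for example at the Kerberos errors above.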
