At seemingly random times, when I create a new protection group for our
server called PSTDC (or ServerDC), the job times out and stops after some
number of minutes, never the same amount of time.
The error is "DPM failed to communicate with the protection agent on
pstdc.domain.local because the agent is not responding" (ID 43, Details:
Internal error code 0x8099090E).
First off, does anyone know what this internal error code means? So far
no one has had any real answers on this.
My other 3 servers all seem to communicate just fine.
PSTDC is a GC/DC/DNS/DHCP/SQL2k5-Config Manager/SQL 2k-Dynamics/file server.
We have a "backup" server for the GC/DC/DNS/DHCP roles.
I've verified I can access both servers during this time: ping works, as
does everything else, and there are no errors related to this in Event Viewer.
I've tried uninstalling the agent and reinstalling it (from the DPM server,
backup01). I've also tried installing the agent from the setup.exe on
PSTDC.
On PSTDC, the application log has several errors. I'll post them from start
to finish in case this helps. The backup started at 7:03pm.
At 7:17 I get this error related to SMS Server. It only appears while
the backup is running (I'm not sure whether that's normal). All of these are
errors, not warnings:
Event ID 5438, Source: SMS Server:
On 11/15/2007 7:17:12 PM, component SMS_MP_CONTROL_MANAGER on computer PSTDC
reported: MP Control Manager detected MP is not responding to HTTP requests.
The http error is 2147500037.
Possible cause: MP service is not started or not responding.
Solution: Manually restart the SMS Agent Host service on the MP.
Possible cause: IIS service is not responding.
Solution: Manually restart the W3SVC service on the MP.
For more information, refer to Microsoft Knowledge Base article 838891.
--------
At 7:40pm there is an ESENT database error in the app log:
Event Type: Error
Event Source: ESENT
Event Category: Logging/Recovery
Event ID: 215
Date: 11/15/2007
Time: 7:40:46 PM
User: N/A
Computer: PSTDC
Description:
tcpsvcs (1216) The backup has been stopped because it was halted by the
client or the connection with the client failed.
-----------
At 7:51pm (backup still going):
Event ID 85, Source: DPMRA:
Event Type: Error
Event Source: DPMRA
Event Category: None
Event ID: 85
Date: 11/15/2007
Time: 7:51:38 PM
User: NT AUTHORITY\SYSTEM
Computer: PSTDC
Description:
A DPM agent failed to communicate with the DPM service on backup01.pst.local
because of a communication error. Make sure that backup01.pst.local is
remotely accessible from the computer running the DPM agent. If a firewall is
enabled on backup01.pst.local, make sure that it is not blocking requests
from the computer running the DPM agent (Error code: 0x800706ba, full name:
backup01.pst.local).
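
For reference, the error codes quoted so far can be split into their component fields, assuming they follow the standard 32-bit HRESULT layout (failure bit, 11-bit facility, 16-bit code). This is only a rough sketch: the helper name decode_hresult is made up for illustration, and it says nothing about what DPM means by its own internal code. Under that layout, 0x800706BA is facility 7 (Win32) with code 1722, which is the standard "The RPC server is unavailable" error, and the SMS http error 2147500037 is 0x80004005, the generic E_FAIL "unspecified error". The DPM code 0x8099090E splits into facility 153 and code 2318, which is not a standard Windows facility I can name, so it is presumably DPM-specific.

```python
def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT-style value into its standard fields.

    Assumes the usual HRESULT layout: bit 31 is the failure flag,
    bits 16-26 are the facility, bits 0-15 are the error code.
    """
    value &= 0xFFFFFFFF  # normalise to an unsigned 32-bit value
    return {
        "hex": f"0x{value:08X}",
        "failure": bool(value >> 31),
        "facility": (value >> 16) & 0x7FF,
        "code": value & 0xFFFF,
    }


# Codes quoted in this post: the DPMRA event, the SMS http error
# (logged in decimal), and the DPM "internal error code".
for raw in (0x800706BA, 2147500037, 0x8099090E):
    print(decode_hresult(raw))
```

The facility/code split only helps for codes that map onto documented Windows errors; for the DPM internal code it mostly confirms that the value is not a plain Win32 error.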
----------
Backup fails around 8:04pm.
Even after this, I still see the ESENT error and the SMS Server error (but
these weren't there before I started the backup tests).
I don't see much in the System log, except a bunch of Kerberos errors
(Event ID 4, source Kerberos) like this:
Event Type: Error
Event Source: Kerberos
Event Category: None
Event ID: 4
Date: 11/15/2007
Time: 7:51:24 PM
User: N/A
Computer: PSTDC
Description:
The kerberos client received a KRB_AP_ERR_MODIFIED error from the server
PST037$. The target name used was RPCSS/PST028.pst.local. This indicates
that the password used to encrypt the kerberos service ticket is different
than that on the target server. Commonly, this is due to identically named
machine accounts in the target realm (PST.LOCAL), and the client realm.
Please contact your system administrator.
There are no DNS errors either. There are a few DHCP database errors, but I
don't believe these are the issue.
-------
Finally, on backup01, in the DPM alerts, the info pane at 7:03, when I
started, has this:
Event ID 1, DPM-EM event:
Event Type: Information
Event Source: DPM-EM
Event Category: None
Event ID: 1
Date: 11/15/2007
Time: 7:03:16 PM
User: N/A
Computer: BACKUP01
Description:
The replica of D:\ on pstdc.pst.local is being created. After the initial
copy is made, only incremental changes are synchronized. (ID: 3162)
DPM ID: 2^|^BACKUP01^|^Replica creation in
progress^|^DPM^|^Backup^|^pstdc.pst.local^|^57e3f31e-b545-4ca0-8a94-645ce6e445cf
But oddly, at 8:04 when it fails, I see:
Event ID 2, DPM-EM:
Event Type: Information
Event Source: DPM-EM
Event Category: None
Event ID: 2
Date: 11/15/2007
Time: 8:04:37 PM
User: N/A
Computer: BACKUP01
Description:
The replica of D:\ on pstdc.pst.local is being created. After the initial
copy is made, only incremental changes are synchronized. (ID: 3162)
DPM ID: 2^|^BACKUP01^|^Replica creation in
progress^|^DPM^|^Backup^|^pstdc.pst.local^|^57e3f31e-b545-4ca0-8a94-645ce6e445cf
These general info events are followed by the Event ID 1 DPM-EM failure
report about the replica being inconsistent, also at 8:04pm.
Any thoughts would be great. This is the big picture as of now.
--
Thanks,
Kapil
This posting is provided "AS IS" with no warranties, and confers no rights.
"markm75" <mar...@discussions.microsoft.com> wrote in message
news:87D1C567-553F-412E...@microsoft.com...
I've been up and down checking for communication errors on this server
and the other. I can find none; everything seems fine. Is there
another test I should do to find what may be going on?
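
One further test beyond ping would be a plain TCP connect check in both directions while the backup is running. The sketch below is only an illustration: check_ports is a made-up helper name, and the port list (135 for the RPC endpoint mapper, 5718 and 5719 for the DPM agent) is an assumption based on the ports usually listed for DPM agent traffic, so verify it against the DPM firewall documentation for your version. DCOM also uses dynamically assigned ports, which this does not cover.

```python
import socket


def check_ports(host, ports, timeout=3.0):
    """Try a TCP connect to each port; return {port: True/False}."""
    results = {}
    for port in ports:
        try:
            # A successful connect only proves the port is reachable,
            # not that the service behind it is healthy.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts and DNS failures.
            results[port] = False
    return results


# Example usage (hostnames taken from the post, port list is an assumption):
#   run on PSTDC:    check_ports("backup01.pst.local", [135, 5718, 5719])
#   run on backup01: check_ports("pstdc.pst.local", [135, 5718, 5719])
```

Running it every minute or so from both machines during a replica creation would show whether a port stops answering around the time the DPMRA event 85 is logged.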