I hope someone can help me. We have been noticing some issues with DFS in
Prod and it's not looking too good.
The DFS setup comprises 6 servers: 4 referral servers and 2 servers which
host the data (400 GB in total), an A Node and a B Node across 2 data
centers. So that is 3 servers per data center, across quite a fast link,
i.e. a file copy of 1 GB takes about 3 mins.
In recent weeks users have been reporting that the data isn't consistent
across the 2 sites, and after investigation it would appear that they are
correct.
I have run FRSDiag.exe on the 2 servers hosting the data and have the
following outputs:-
A Node
------------------------------------------------------------
FRSDiag v1.7 on 05/01/2008 12:58:38
.\APRAAFS03CV on 2008-01-05 at 12.58.38
------------------------------------------------------------
Checking for errors/warnings in FRS Event Log ....
NtFrs 05/01/2008 12:22:11 Warning 13508 The File Replication Service is
having trouble enabling replication from BPRAAFS03CV to APRAAFS03CV for
e:\production\citrixim using the DNS name BPRAAFS0xcv.domain.local. FRS will
keep retrying. Following are some of the reasons you would see this
warning. [1] FRS can not correctly resolve the DNS name
BPRAAFS0xcv.domain.local from this computer. [2] FRS is not running on
BPRAAFS0xcv.domain.local. [3] The topology information in the Active
Directory for this replica has not yet replicated to all the Domain
Controllers. This event log message will appear once per connection,
After the problem is fixed you will see another event log message indicating
that the connection has been established.
NtFrs 05/01/2008 12:22:10 Warning 13508 The File Replication Service is
having trouble enabling replication from BPRAAFS03CV to APRAAFS03CV for
e:\production\dialect using the DNS name BPRAAFS0xcv.domain.local. FRS will
keep retrying. Following are some of the reasons you would see this
warning. [1] FRS can not correctly resolve the DNS name
BPRAAFS0xcv.domain.local from this computer. [2] FRS is not running on
BPRAAFS0xcv.domain.local. [3] The topology information in the Active
Directory for this replica has not yet replicated to all the Domain
Controllers. This event log message will appear once per connection,
After the problem is fixed you will see another event log message indicating
that the connection has been established.
NtFrs 05/01/2008 12:22:03 Warning 13508 The File Replication Service is
having trouble enabling replication from BPRAAFS03CV to APRAAFS03CV for
e:\production\envmgr using the DNS name BPRAAFS0xcv.domain.local. FRS will
keep retrying. Following are some of the reasons you would see this
warning. [1] FRS can not correctly resolve the DNS name
BPRAAFS0xcv.domain.local from this computer. [2] FRS is not running on
BPRAAFS0xcv.domain.local. [3] The topology information in the Active
Directory for this replica has not yet replicated to all the Domain
Controllers. This event log message will appear once per connection,
After the problem is fixed you will see another event log message indicating
that the connection has been established.
NtFrs 04/01/2008 20:12:12 Warning 13520 The File Replication Service moved
the preexisting files in e:\production\citrixim to
e:\production\citrixim\NtFrs_PreExisting___See_EventLog. The File
Replication Service may delete the files in
e:\production\citrixim\NtFrs_PreExisting___See_EventLog at any time. Files
can be saved from deletion by copying them out of
e:\production\citrixim\NtFrs_PreExisting___See_EventLog. Copying the files
into e:\production\citrixim may lead to name conflicts if the files already
exist on some other replicating partner. In some cases, the File
Replication Service may copy a file from
e:\production\citrixim\NtFrs_PreExisting___See_EventLog into
e:\production\citrixim instead of replicating the file from some other
replicating partner. Space can be recovered at any time by deleting
the files in e:\production\citrixim\NtFrs_PreExisting___See_EventLog.
NtFrs 04/01/2008 19:39:02 Error 13552 The File Replication Service is unable
to add this computer to the following replica set: "PROD|CITRIXIM"
This could be caused by a number of problems such as: -- an invalid
root path, -- a missing directory, -- a missing disk volume,
-- a file system on the volume that does not support NTFS 5.0 The
information below may help to resolve the problem: Computer DNS name is
"apraafs0xcv.domain.local" Replica set member name is
"{75F45136-2EFB-4CA5-B08B-9109145570A7}" Replica set root path is
"e:\production\citrixim" Replica staging directory path is
"e:\frs-staging" Replica working directory path is "c:\windows\ntfrs\jet"
Windows error status code is ERROR_BAD_COMMAND FRS error status code is
FrsErrorResourceInUse Other event log messages may also help determine
the problem. Correct the problem and the service will attempt to restart
replication automatically at a later time.
NtFrs 04/01/2008 19:39:01 Error 13544 The File Replication Service cannot
replicate e:\production\citrixim because it overlaps the replicating
directory e:\production\citrixim.
WARNING: Found Event ID 13508 errors without trailing 13509 ... see above
for (up to) the 3 latest entries!
......... failed 4
Checking for minimum FRS version requirement ... passed
Checking for errors/warnings in ntfrsutl ds ... passed
Checking for Replica Set configuration triggers...
Warning: DirFilter for replica set "PROD|ENVMGR" not set to Default Value!
Current set DirFilter is = "backup"
......... passed with 1 warning(s)
Checking for suspicious file Backlog size...
ERROR : File Backlog TO server "FJLSP\BPRAAFS03CV$" is : 111391 :: Unless
this is due to your schedule, this is a problem!
ERROR : File Backlog TO server "FJLSP\BPRAAFS03CV$" is : 14472 :: Unless
this is due to your schedule, this is a problem!
failed with 2 error(s) and 0 warning(s)
Checking Overall Disk Space and SYSVOL structure (note: integrity is not
checked)... passed
Checking for suspicious inlog entries ... passed
Checking for suspicious outlog entries ... passed
Checking for appropriate staging area size ... passed
Checking for errors in debug logs ...
ERROR on NtFrs_0005.log : "ERROR_ACCESS_DENIED" : <SndCsMain:
3756: 904: S0: 12:43:24> :SR: Cmd 05b93ea8, CxtG 9059a9c6, WS
ERROR_ACCESS_DENIED, To BPRAAFS0xcv.domain.local Len: (396) [SndFail -
Send Penalty]
ERROR on NtFrs_0005.log : "ERROR_ACCESS_DENIED" : <SndCsMain:
4148: 877: S0: 12:43:26> :SR: Cmd 08115008, CxtG eabcaedc, WS
ERROR_ACCESS_DENIED, To BPRAAFS0xcv.domain.local Len: (398) [SndFail - rpc
call]
ERROR on NtFrs_0005.log : "ERROR_ACCESS_DENIED" : <SndCsMain:
4148: 904: S0: 12:43:26> :SR: Cmd 08115008, CxtG eabcaedc, WS
ERROR_ACCESS_DENIED, To BPRAAFS0xcv.domain.local Len: (398) [SndFail -
Send Penalty]
ERROR on NtFrs_0003.log : "EPT_S_NOT_REGISTERED(This may indicate that DNS
returns the IP address of the wrong computer. Check DNS records being
returned, Check if FRS is currently running on the target server. Check if
Ntfrs is registered with the End-Point-Mapper on target server!)" :
<SndCsMain: 456: 883: S0: 12:15:11> ++ ERROR -
EXCEPTION (000006d9) : WStatus: EPT_S_NOT_REGISTERED
ERROR on NtFrs_0003.log : "EPT_S_NOT_REGISTERED(This may indicate that DNS
returns the IP address of the wrong computer. Check DNS records being
returned, Check if FRS is currently running on the target server. Check if
Ntfrs is registered with the End-Point-Mapper on target server!)" :
<SndCsMain: 456: 884: S0: 12:15:11> :SR: Cmd 510d65a0,
CxtG d3b35eff, WS EPT_S_NOT_REGISTERED, To BPRAAFS0xcv.domain.local Len:
(390) [SndFail - rpc exception]
ERROR on NtFrs_0003.log : "EPT_S_NOT_REGISTERED(This may indicate that DNS
returns the IP address of the wrong computer. Check DNS records being
returned, Check if FRS is currently running on the target server. Check if
Ntfrs is registered with the End-Point-Mapper on target server!)" :
<SndCsMain: 456: 904: S0: 12:15:11> :SR: Cmd 510d65a0,
CxtG d3b35eff, WS EPT_S_NOT_REGISTERED, To BPRAAFS0xcv.domain.local Len:
(390) [SndFail - Send Penalty]
Found 244 ERROR_ACCESS_DENIED error(s)! Latest ones (up to 3) listed above
Found 3 EPT_S_NOT_REGISTERED error(s)! Latest ones (up to 3) listed above
......... failed with 247 error entries
Checking NtFrs Service (and dependent services) state...passed
Checking NtFrs related Registry Keys for possible problems...passed
Checking Repadmin Showreps for errors...passed
B Node
------------------------------------------------------------
FRSDiag v1.7 on 05/01/2008 13:03:59
.\BPRAAFS03CV on 2008-01-05 at 13.03.59
------------------------------------------------------------
Checking for errors/warnings in FRS Event Log ....
NtFrs 05/01/2008 12:19:12 Warning 13508 The File Replication Service is
having trouble enabling replication from APRAAFS03CV to BPRAAFS03CV for
e:\production\citrixim using the DNS name apraafs0xcv.domain.local. FRS will
keep retrying. Following are some of the reasons you would see this
warning. [1] FRS can not correctly resolve the DNS name
apraafs0xcv.domain.local from this computer. [2] FRS is not running on
apraafs0xcv.domain.local. [3] The topology information in the Active
Directory for this replica has not yet replicated to all the Domain
Controllers. This event log message will appear once per connection,
After the problem is fixed you will see another event log message indicating
that the connection has been established.
NtFrs 05/01/2008 12:19:12 Warning 13508 The File Replication Service is
having trouble enabling replication from APRAAFS03CV to BPRAAFS03CV for
e:\production\dialect using the DNS name apraafs0xcv.domain.local. FRS will
keep retrying. Following are some of the reasons you would see this
warning. [1] FRS can not correctly resolve the DNS name
apraafs0xcv.domain.local from this computer. [2] FRS is not running on
apraafs0xcv.domain.local. [3] The topology information in the Active
Directory for this replica has not yet replicated to all the Domain
Controllers. This event log message will appear once per connection,
After the problem is fixed you will see another event log message indicating
that the connection has been established.
NtFrs 05/01/2008 12:19:10 Warning 13508 The File Replication Service is
having trouble enabling replication from APRAAFS03CV to BPRAAFS03CV for
e:\production\envmgr using the DNS name apraafs0xcv.domain.local. FRS will
keep retrying. Following are some of the reasons you would see this
warning. [1] FRS can not correctly resolve the DNS name
apraafs0xcv.domain.local from this computer. [2] FRS is not running on
apraafs0xcv.domain.local. [3] The topology information in the Active
Directory for this replica has not yet replicated to all the Domain
Controllers. This event log message will appear once per connection,
After the problem is fixed you will see another event log message indicating
that the connection has been established.
NtFrs 05/01/2008 12:13:55 Error 13504 The File Replication Service stopped
without cleaning up.
NtFrs 04/01/2008 20:28:23 Warning 13522 The File Replication Service paused
because the staging area is full. Staging files are used to replicate
created, deleted or modified files between partners. FRS will automatically
remove least recently used files from this staging area (in the order of the
longest time since the last access) until the amount of space in use has
dropped below 60 of the staging space-limit, after which replication will
resume. If this condition occurs frequently: Confirm that all
direct outbound replication partners receiving updates from this member are
online and receiving udpates. Verify that the replication schedule for
receiving partners is open or "on" for a sufficient window of time to
accomodate the number of files being replicated. Consider increasing
the staging area to improve system performance. The current value
of the staging space limit is 10240000 KB. To change the staging space
limit, run regedit: Click on Start -> Run and type REGEDT . Expand
HKEY_LOCAL_MACHINE, SYSTEM, CurrentControlSet, Services, NtFrs, Parameters,
and the value "Staging Space Limit in KB".
NtFrs 04/01/2008 19:46:44 Error 13552 The File Replication Service is unable
to add this computer to the following replica set: "PROD|CITRIXIM"
This could be caused by a number of problems such as: -- an invalid
root path, -- a missing directory, -- a missing disk volume,
-- a file system on the volume that does not support NTFS 5.0 The
information below may help to resolve the problem: Computer DNS name is
"BPRAAFS0xcv.domain.local" Replica set member name is
"{BE62DCAD-3149-4C63-B050-395A12C5D72F}" Replica set root path is
"e:\production\citrixim" Replica staging directory path is
"e:\frs-staging" Replica working directory path is "c:\windows\ntfrs\jet"
Windows error status code is ERROR_BAD_COMMAND FRS error status code is
FrsErrorResourceInUse Other event log messages may also help determine
the problem. Correct the problem and the service will attempt to restart
replication automatically at a later time.
NtFrs 04/01/2008 19:46:43 Error 13544 The File Replication Service cannot
replicate e:\production\citrixim because it overlaps the replicating
directory e:\production\citrixim.
WARNING: Found Event ID 13508 errors without trailing 13509 ... see above
for (up to) the 3 latest entries!
......... failed 5
Checking for minimum FRS version requirement ... passed
Checking for errors/warnings in ntfrsutl ds ... passed
Checking for Replica Set configuration triggers...
Warning: DirFilter for replica set "PROD|ENVMGR" not set to Default Value!
Current set DirFilter is = "backup"
......... passed with 1 warning(s)
Checking for suspicious file Backlog size...
ERROR : File Backlog TO server "FJLSP\APRAAFS03CV$" is : 61354 :: Unless
this is due to your schedule, this is a problem!
ERROR : File Backlog TO server "FJLSP\APRAAFS03CV$" is : 3293 :: Unless
this is due to your schedule, this is a problem!
ERROR : File Backlog TO server "FJLSP\APRAAFS03CV$" is : 468 :: Unless
this is due to your schedule, this is a problem!
failed with 3 error(s) and 0 warning(s)
Checking Overall Disk Space and SYSVOL structure (note: integrity is not
checked)... passed
Checking for suspicious inlog entries ... passed
Checking for suspicious outlog entries ... passed
Checking for appropriate staging area size ... passed
Checking for errors in debug logs ...
ERROR on NtFrs_0004.log : "ERROR_ACCESS_DENIED" : <SndCsMain:
5020: 904: S0: 12:44:34> :SR: Cmd 07ffcec8, CxtG 60ba629b, WS
ERROR_ACCESS_DENIED, To apraafs0xcv.domain.local Len: (390) [SndFail -
Send Penalty]
ERROR on NtFrs_0004.log : "ERROR_ACCESS_DENIED" : <SndCsMain:
5020: 877: S0: 12:44:40> :SR: Cmd 055f3f20, CxtG 1830e0b5, WS
ERROR_ACCESS_DENIED, To apraafs0xcv.domain.local Len: (398) [SndFail - rpc
call]
ERROR on NtFrs_0004.log : "ERROR_ACCESS_DENIED" : <SndCsMain:
5020: 904: S0: 12:44:40> :SR: Cmd 055f3f20, CxtG 1830e0b5, WS
ERROR_ACCESS_DENIED, To apraafs0xcv.domain.local Len: (398) [SndFail -
Send Penalty]
ERROR on NtFrs_0004.log : "EPT_S_NOT_REGISTERED(This may indicate that DNS
returns the IP address of the wrong computer. Check DNS records being
returned, Check if FRS is currently running on the target server. Check if
Ntfrs is registered with the End-Point-Mapper on target server!)" :
<SndCsMain: 5068: 883: S0: 12:17:59> ++ ERROR -
EXCEPTION (000006d9) : WStatus: EPT_S_NOT_REGISTERED
ERROR on NtFrs_0004.log : "EPT_S_NOT_REGISTERED(This may indicate that DNS
returns the IP address of the wrong computer. Check DNS records being
returned, Check if FRS is currently running on the target server. Check if
Ntfrs is registered with the End-Point-Mapper on target server!)" :
<SndCsMain: 5068: 884: S0: 12:17:59> :SR: Cmd 055f1d50,
CxtG 6b473bcc, WS EPT_S_NOT_REGISTERED, To apraafs0xcv.domain.local Len:
(394) [SndFail - rpc exception]
ERROR on NtFrs_0004.log : "EPT_S_NOT_REGISTERED(This may indicate that DNS
returns the IP address of the wrong computer. Check DNS records being
returned, Check if FRS is currently running on the target server. Check if
Ntfrs is registered with the End-Point-Mapper on target server!)" :
<SndCsMain: 5068: 904: S0: 12:17:59> :SR: Cmd 055f1d50,
CxtG 6b473bcc, WS EPT_S_NOT_REGISTERED, To apraafs0xcv.domain.local Len:
(394) [SndFail - Send Penalty]
ERROR on NtFrs_0004.log : "WS RPC_S_SERVER_TOO_BUSY(The target server may
be overwhelmed, memory or CPU-wise. Is the target server a very busy
bridgehead?)" : <SndCsMain: 5068: 904: S0: 12:17:00>
:SR: Cmd 043fbee8, CxtG 73229b7a, WS RPC_S_SERVER_TOO_BUSY, To
apraafs0xcv.domain.local Len: (406) [SndFail - Send Penalty]
ERROR on NtFrs_0004.log : "WS RPC_S_SERVER_TOO_BUSY(The target server may
be overwhelmed, memory or CPU-wise. Is the target server a very busy
bridgehead?)" : <SndCsMain: 5068: 884: S0: 12:17:01>
:SR: Cmd 05858c78, CxtG 26119944, WS RPC_S_SERVER_TOO_BUSY, To
apraafs0xcv.domain.local Len: (388) [SndFail - rpc exception]
ERROR on NtFrs_0004.log : "WS RPC_S_SERVER_TOO_BUSY(The target server may
be overwhelmed, memory or CPU-wise. Is the target server a very busy
bridgehead?)" : <SndCsMain: 5068: 904: S0: 12:17:01>
:SR: Cmd 05858c78, CxtG 26119944, WS RPC_S_SERVER_TOO_BUSY, To
apraafs0xcv.domain.local Len: (388) [SndFail - Send Penalty]
Found 214 ERROR_ACCESS_DENIED error(s)! Latest ones (up to 3) listed above
Found 39 EPT_S_NOT_REGISTERED error(s)! Latest ones (up to 3) listed above
Found 8 WS RPC_S_SERVER_TOO_BUSY error(s)! Latest ones (up to 3) listed above
......... failed with 261 error entries
Checking NtFrs Service (and dependent services) state...passed
Checking NtFrs related Registry Keys for possible problems...passed
Checking Repadmin Showreps for errors...passed
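To make the two reports above easier to compare, here is a minimal Python
sketch that tallies the "Found N ... error(s)!" summary lines and the
"File Backlog TO server" lines from a saved FRSDiag report. It is not part
of FRSDiag; the function name summarize and the idea of saving each report
to a text file are my own assumptions, and the patterns are based only on
the line formats shown in the output above.

```python
import re
from collections import Counter

# Matches summary lines such as:
#   Found 244 ERROR_ACCESS_DENIED error(s)! Latest ones (up to 3) listed above
# The optional "WS " covers lines like "Found 8 WS RPC_S_SERVER_TOO_BUSY error(s)!"
FOUND_RE = re.compile(r"Found (\d+) (?:WS )?(\w+) error\(s\)!")

# Matches backlog lines such as:
#   ERROR : File Backlog TO server "FJLSP\BPRAAFS03CV$" is : 111391 :: Unless
BACKLOG_RE = re.compile(r'File Backlog TO server "([^"]+)" is : (\d+)')


def summarize(report_text):
    """Return (errors, backlog) dicts for one FRSDiag report.

    errors maps an error name to the count FRSDiag reported for it.
    backlog maps a partner server to the sum of its reported backlogs.
    """
    # Collapse line wrapping so a pattern split across lines still matches.
    text = " ".join(report_text.split())

    errors = Counter()
    for count, name in FOUND_RE.findall(text):
        errors[name] += int(count)

    backlog = Counter()
    for server, count in BACKLOG_RE.findall(text):
        backlog[server] += int(count)

    return dict(errors), dict(backlog)
```

Calling summarize() on the text of each node's report gives two small
dictionaries per node, which is easier to eyeball than the full output.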
Can anyone help me determine what these outputs are saying?
I am also running MS Ultrasound, which keeps reporting that morphed
directories exist, that there are very old backlogged files, and that an
FRS replica set is in state Stopped. I've upped the FRS staging to 20 GB.
Also, how can I check the consistency of c:\windows\ntfrs\jet\ntfrs.jdb?
Please get back with anything that might assist me with regard to
corrective actions.
Many Thanks
--
Deano