We lost a Windows 2003 SP2 domain controller last week (dc1.domain.com), and
I've been trying to get everything cleaned up and stabilized ever since.
This DC happened to be the server we used as our primary server for internal
DNS. We have other servers with DNS installed, and I have changed our DHCP
scope and manually configured our static clients so they look to the
secondary DNS servers. But is there anything else I need to be cleaning up
in DNS? For example, I am in the process of cleaning up AD metadata on the
failed machine. Are there similar steps I need to take to clean up DNS?
Should I just manually delete any record that references the old
dc1.domain.com machine?
One thing I noticed specifically: We replicate DNS between us and a trusted
site. That site for some reason seems no longer able to get zone updates.
I have to refresh it manually. And I notice that the Start of Authority on
that remote domain references the failed DC on our domain. It's grayed out,
so I can't seem to change it. Should I rebuild the zone, or is there some
way I can just tell it to look to a different DNS server for SOA?
We have noticed that some things seem to take longer despite my best
efforts. Logon scripts take longer. Outlook seems to time out on some
machines when launched. Etc. Usually a reboot fixes it, but some people
are getting kind of tired.
Anyway, any help is appreciated.
Thanks
See the following article:
http://support.microsoft.com/kb/555846/en-us
In the DNS zones, check that the records for the failed machine are completely
removed, including from the zone properties.
Best regards
Meinolf Weber
> This DC happened to be the server we used as our primary server for internal DNS.
Nonetheless, you are using Active Directory integrated zones, right?
> One thing I noticed specifically: We replicate DNS between us and a trusted site.
How?
> Should I rebuild the zone, or is there some way I can just tell it to look to a different DNS server for SOA?
What, in its configuration, makes you think that the remote server
is looking to the contents of the SOA resource record to determine
whence it should go asking for database content? That's an unusual
configuration.
>>>>This DC happened to be the server we used as our primary server for internal DNS.
>>Nonetheless, you are using Active Directory integrated zones, right?
Yes, AD integrated.
>>How?
Not entirely certain. There is a two-way trust between the local domain (FTW) and the remote domain (NO). DCs in each domain are acting as DNS servers, and each has zone records for the other's domain. In other words, FTWDC2 has a DNS zone for ftworth.com and neworleans.com. NODC1 has DNS zone for neworleans.com and ftworth.com.
Since the crash, I have to remote control NODC1 and
"reload" the ftworth.com zone.
Well, I'm not sure that's what it is doing. But inside the ftworth.com zone on NODC1, there is a record of type SOA that references the crashed domain controller in Ft Worth, namely FTWDC1.ftworth.com.
Hope that all makes some sense. Thanks for the
response.
If you are saying you have a crashed DC that was never rebuilt, and it was removed from the infrastructure and AD database with a metadata cleanup (http://support.microsoft.com/kb/216498), then you also have to remove any reference to it as an NS. Go into the zone properties, and delete it.
If your partner company is simply using zone transfers, they may still be referencing the DC as a Master. Tell them to update their secondary zone to an existing DC/DNS server on your end.
No need to rebuild the zone. If it was AD integrated (as you indicated in your reply to Jonathan), the zone data is replicated to all other DC/DNS servers in its replication scope.
As for Outlook timing out, that indicates it cannot contact a Global Catalog. How many DCs do you have, and are all of them GCs? Assuming you have one domain in your forest, it is recommended that all DCs are GCs.
Was the failed DC a GC? If it was not rebuilt and simply left unrepaired, you have to run the metadata cleanup procedure. You also have to ensure it is deleted from Sites and Services, and transfer any FSMO roles it held to other DCs. Meinolf gave you a link to guide you through all of this, but you have not said whether you followed it, or even whether the DC was rebuilt. What I am seeing are replies about symptoms, but nothing we can use to help you specifically.
So let's try to organize what we need to help you:
1. How many DCs do you have?
2. How many of them are GCs?
3. Was the failed DC rebuilt, or was a metadata cleanup run?
4. Is the failed DC listed as a DC in Exchange?
5. What version of Exchange?
6. How many domains in your forest?
7. Post an ipconfig /all of a sample workstation (such as one having Outlook issues) and of your DCs, please.
8. Is your partner organization using zone transfers? If you do not know the answer to this question, who set all of this up?
Thank you,
Ace
- DC1 crashed and will never be rebuilt.
- We had 3 domain controllers: DC1, DC2, and DC3. We now only have two
(DC2 and DC3). I will build DC4 to replace DC1 after I get all this cleaned
up.
- I seized FSMO roles to DC3. Except for Infrastructure, which was already
on DC2.
- DC1 and DC3 were both Global Catalogs. Dunno why we had two.
- I confirmed DC3 is an active GC by using ldp.exe and connecting to DC3 via
port 389 and 3268.
- Today I followed the metadata cleanup per KB 216498 and it appears to have
been successful. Don't know if this has yet resolved any latency issues.
Hopefully so.
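For anyone who wants a quick scripted version of the port check above, here is a minimal sketch. It only tests whether something accepts a TCP connection on 389 (LDAP) and 3268 (GC), which is a weaker test than binding with ldp.exe. The function names are made up for illustration and are not part of any Microsoft tool.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_dc_ports(host, ports=(389, 3268)):
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_is_open(host, port) for port in ports}
```

Calling `check_dc_ports("dc3.domain.com")` (a placeholder host name) returns a dict keyed by port. A DC that answers on 389 but not on 3268 is probably not serving as a GC.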
Now, lingering questions:
- My main question now has to do with SRV records in our remaining DNS
servers that still reference dc1.domain.com. Apparently metadata cleanup
does not remove SRV records, although it looks like it cleaned a lot of
other stuff out of DNS that referenced the old DC (including CNAME). Can /
Should I manually delete the remaining SRV records? I suppose they don't
hurt, because I got rid of all the A records I could find, so I'm not sure
how the SRVs would resolve to anything at this point.
- I don't know if the failed DC was listed in Exchange 2003 SP2. I presume
it was, but don't know offhand how to check.
- Three domains in forest.
- My predecessor(s) set most of this up.
Thanks again. Hopefully the metadata cleanup has fixed a few things. I'll
post a follow up tomorrow. Would like to know about the SRV records,
though.
The DNS zone at the remote site is a Secondary DNS zone.
Therefore I presume it is relatively safe to delete and rebuild? It is the
one with the SOA record pointing to the failed DC.
Thanks.
Yep!
Ace
Since you have more than one domain in the forest, you should have at least two DCs per domain to ensure the IM role is not on the same DC as the GC. You can never have enough GCs.
> - I confirmed DC3 is an active GC by using ldp.exe and connecting to DC3 via
> port 389 and 3268.
> - Today I followed the metadata cleanup per KB 216498 and it appears to have
> been successful. Don't know if this has yet resolved any latency issues.
> Hopefully so.
Also delete the server object for the failed DC in Sites and Services.
>
> Now, lingering questions:
>
> - My main question now has to do with SRV records in our remaining DNS
> servers that still reference dc1.domain.com. Apparently metadata cleanup
> does not remove SRV records, although it looks like it cleaned a lot of
> other stuff out of DNS that referenced the old DC (including CNAME).
Which CNAME?
As for cleaning up SRV records, try this:
Go to c:\windows\system32\config
Rename the netlogon.dns and netlogon.bak files, adding .old to the end.
CMD prompt:
ipconfig /registerdns
net stop netlogon
net start netlogon
Wait about a minute, refresh the DNS console, and check the SRVs. Also check the LdapIpAddress (the one that says 'same as parent') and GcIpAddress (the entry under _gc._msdcs.domain.local).
If it doesn't clean up, then manually delete the records. Go into zone properties and manually delete the old DC references in the Name Servers tab.
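If you would rather list the leftovers before deleting anything by hand, here is a rough sketch that scans record lines for anything still pointing at the failed DC. It assumes the records are in ordinary zone-file text form (owner, TTL, class, type, data), which is how netlogon.dns stores them as far as I know. The sample lines and the function name are invented for illustration.

```python
def records_referencing(lines, failed_host):
    """Return (owner, type, rdata) for SRV/CNAME/NS records that target failed_host."""
    target = failed_host.rstrip(".").lower()
    hits = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        fields = line.split()
        # Expected layout: owner TTL class type rdata...
        if len(fields) < 5 or fields[2].upper() != "IN":
            continue
        owner, rtype, rdata = fields[0], fields[3].upper(), fields[4:]
        # For SRV, CNAME and NS the target host is the last rdata field.
        if rtype in ("SRV", "CNAME", "NS") and rdata[-1].rstrip(".").lower() == target:
            hits.append((owner, rtype, " ".join(rdata)))
    return hits


# Made-up sample records, not copied from a real server.
SAMPLE_LINES = [
    "_ldap._tcp.domain.com. 600 IN SRV 0 100 389 dc1.domain.com.",
    "_ldap._tcp.domain.com. 600 IN SRV 0 100 389 dc3.domain.com.",
    "_gc._tcp.domain.com. 600 IN SRV 0 100 3268 DC1.domain.com.",
    "domain.com. 600 IN A 172.16.10.3",
]

for owner, rtype, rdata in records_referencing(SAMPLE_LINES, "dc1.domain.com"):
    print(owner, rtype, rdata)
```

The comparison ignores case and the trailing dot, since the same host can show up written either way. It only reports records; the deleting is still done in the DNS console.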
> Can /
> Should I manually delete the remaining SRV records? I suppose they don't
> hurt, because I got rid of all the A records I could find, so I'm not sure
> how the SRVs would resolve to anything at this point.
Yes, but follow the process above, first.
>
> - I don't know if the failed DC was listed in Exchange 2003 SP2. I presume
> it was, but don't know offhand how to check.
ESM - Server properties, Directory Services Tab.
Also check the Offline Address Book.
Is Exchange installed on a DC?? (I hope not).
>
> - Three domains in forest.
Three domains in the forest? Then do you have a minimum of two DCs per domain? If not, then you MUST. This is because the GC role cannot be on the same server holding the IM role.
>
> - My predecessor(s) set most of this up.
>
> Thanks again. Hopefully the metadata cleanup has fixed a few things. I'll
> post a follow up tomorrow. Would like to know about the SRV records,
> though.
>
What about the ipconfigs? If you are on a private network, it's safe to post them. You can change the host names and domain name, but keep the name format, and be careful not to change IP subnets, etc., or it will look to everyone like you have a problem. Ipconfigs are extremely informative. They tell us a lot, such as the primary DNS suffix (whether it is a single-label name, or mismatched with the DNS domain name), the DNS entries (a DC should point to itself first, then a partner DC, but no more than two, because further entries become superfluous when the client-side resolver times out before it ever gets to the third entry), whether the DCs are multihomed (more than one NIC that is not teamed, multiple IPs, and/or RRAS installed), which is not recommended, whether an ISP's DNS server is being used, etc.
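If you have a lot of machines to check, the DNS entries can be pulled out of saved ipconfig /all output with a few lines of script. This is only a sketch: it assumes English-language output and IPv4 addresses, and the function names are made up.

```python
import re

IPV4 = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def dns_servers(ipconfig_text):
    """Return the DNS server addresses listed in `ipconfig /all` output."""
    servers = []
    collecting = False
    for line in ipconfig_text.splitlines():
        if "DNS Servers" in line and ":" in line:
            collecting = True
            value = line.split(":", 1)[1].strip()
        elif collecting and ":" not in line and line.strip():
            value = line.strip()  # continuation line holding only an address
        else:
            collecting = False
            continue
        if IPV4.fullmatch(value):
            servers.append(value)
    return servers


def has_more_than_two(servers):
    """Flag configurations that list more than two DNS servers."""
    return len(servers) > 2
```

Feed it the text of one adapter section, or the whole output if the machine has a single NIC. With several adapters it returns all of their DNS entries in one list.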
How about event log errors? Post any eventID# and Source names.
Ace
Thanks
> Since you have more than one domain in the forest, you should have
> at least two DCs per domain to ensure the IM role is not on the same DC
> as the GC. You can never have enough GCs.
Yes, we have at least two domain controllers per domain. We also have two
Global Catalogs, even within the same site. I thought we only needed one
GC per site, but I could be mistaken.
>> - I confirmed DC3 is an active GC by using ldp.exe and connecting to DC3
>> via
>> port 389 and 3268.
>> - Today I followed the metadata cleanup per KB 216498 and it appears to
>> have
>> been successful. Don't know if this has yet resolved any latency issues.
>> Hopefully so.
> Also delete the server object for the failed DC in Sites and Services.
Yes, I did that.
>>
>> Now, lingering questions:
>>
>> - My main question now has to do with SRV records in our remaining DNS
>> servers that still reference dc1.domain.com. Apparently metadata cleanup
>> does not remove SRV records, although it looks like it cleaned a lot of
>> other stuff out of DNS that referenced the old DC (including CNAME).
> Which CNAME?
Metadata cleanup appears to have removed CNAME of the failed server in
ftworth.com _msdcs folder. Other DCs are still there.
> As for cleaning up SRV records, try this:
> Go to c:\windows\system32\config
> Rename the netlogon.dns and netlogon.bak files, adding .old to the end.
> CMD prompt:
> ipconfig /registerdns
> net stop netlogon
> net start netlogon
> Wait about a minute, refresh the DNS console, and check the SRVs.
> Also check the LdapIpAddress (the one that says 'same as parent') and
> GcIpAddress (the entry under _gc._msdcs.domain.local).
> If it doesn't clean up, then manually delete the records. Go into
> zone properties and manually delete the old DC references in the
> Name Servers tab.
I will do this today and post results. Thank you very much for these
instructions.
>> Can /
>> Should I manually delete the remaining SRV records? I suppose they don't
>> hurt, because I got rid of all the A records I could find, so I'm not
>> sure
>> how the SRVs would resolve to anything at this point.
> Yes, but follow the process above, first.
>>
>> - I don't know if the failed DC was listed in Exchange 2003 SP2. I
>> presume
>> it was, but don't know offhand how to check.
> ESM - Server properties, Directory Services Tab.
> Also check the Offline Address Book.
Yes, failed DC is listed in Directory Access in ESM, but it looks like it
moved it to the bottom of the list. Here's what it looks like:
Domain Ctrlr   Site      Domain        Type
DC2            FtWorth   ftworth.com   Config (auto)
DC3            FtWorth   ftworth.com   DC (auto)
DC2            FtWorth   ftworth.com   DC (auto)
DC3            FtWorth   ftworth.com   GC
DC1            Unknown   ftworth.com   GC
> Is Exchange installed on a DC?? (I hope not).
No, Exchange is not installed on a DC.
>> - Three domains in forest.
> Three domains in the forest? Then do you have a minimum of two DCs
> per domain? If not, then you MUST. This is because the GC role cannot be
> on the same server holding the IM role.
Yes, we are set up correctly in this regard. DC2 is IM. DC3 has seized the
other 4 roles and is a GC.
>>
>> - My predecessor(s) set most of this up.
>>
>> Thanks again. Hopefully the metadata cleanup has fixed a few things.
>> I'll
>> post a follow up tomorrow. Would like to know about the SRV records,
>> though.
>>
> What about the ipconfigs?
Windows IP Configuration
   Host Name . . . . . . . . . . . . : ftwd7lbj71
   Primary Dns Suffix . . . . . . . : ftworth.com
   Node Type . . . . . . . . . . . . : Hybrid
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No
   DNS Suffix Search List. . . . . . : ftworth.com
Ethernet adapter Local Area Connection:
   Connection-specific DNS Suffix . : ftworth.com
   Description . . . . . . . . . . . : Broadcom NetXtreme 57xx Gigabit Controller
   Physical Address. . . . . . . . . : 00-22-43-C4-93-4E
   Dhcp Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes
   IP Address. . . . . . . . . . . . : 172.16.220.170
   Subnet Mask . . . . . . . . . . . : 255.255.0.0
   Default Gateway . . . . . . . . . : 172.16.254.1
   DHCP Server . . . . . . . . . . . : 172.16.10.17
   DNS Servers . . . . . . . . . . . : 172.16.10.3
                                       172.16.10.9
   Primary WINS Server . . . . . . . : 172.16.10.9
   Secondary WINS Server . . . . . . : 172.16.10.3
   Lease Obtained. . . . . . . . . . : Wednesday, March 10, 2010 3:53:23 AM
   Lease Expires . . . . . . . . . . : Friday, March 12, 2010 4:38:23 AM
>> How about event log errors? Post any eventID# and Source names.
Surprisingly, Event Log looks pretty clean. I don't see any Directory
Service or DNS errors. There are a couple of File Replication errors
referencing the NEW DC I am trying to build. Maybe I should demote it back
to a member server and wait until I get all this other stuff cleaned up
before trying to promote a new DC.
I will run the SRV cleanup as you suggested. I feel like we're getting
pretty close to having this mess cleaned up. Thanks for sticking with me.
Hopefully this will provide someone else some good documentation down the
road.
Regards
In a multi-domain forest, make sure the GC is NOT on the IM in each domain. Yes, you can have more than one GC, as long as it is not on the same server as the IM role in that domain.
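To make that rule concrete, here is a small sketch that checks an inventory for a DC that is both the IM and a GC. It only encodes the rule as stated in this thread. The dictionary layout and the role abbreviations are my own assumptions for illustration, and the inventory is typed in by hand, not read from AD.

```python
def im_gc_conflicts(domain_controllers):
    """Return names of DCs that hold the Infrastructure Master role and are also GCs."""
    return sorted(
        name
        for name, info in domain_controllers.items()
        if info.get("is_gc") and "IM" in info.get("fsmo_roles", ())
    )


# Inventory as described in this thread, with role names abbreviated for illustration.
ftworth = {
    "DC2": {"is_gc": False, "fsmo_roles": {"IM"}},
    "DC3": {"is_gc": True, "fsmo_roles": {"PDC", "RID", "Schema", "DomainNaming"}},
}

conflicts = im_gc_conflicts(ftworth)
print("Conflicts:", ", ".join(conflicts) if conflicts else "none")
```

An empty result means the IM and GC roles are on separate DCs in that domain. Run it once per domain with that domain's DCs.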
Is that an ipconfig of a DC? If so, why is it set to DHCP? That will cause problems with AD.
I gave you instructions on the SRV records regarding the netlogon.dns file. Yes, you can manually delete any SRV reference to the old DC. Delete the LdapIpAddress, too.
For Exchange, restart the machine. It should rediscover the correct DCs/GCs that exist.
Ace