I'm having the same problem.
If you find a way to fix this, please let us know!
Thanks!
2016-09-12 14:12 GMT-03:00 lingpanda101--- via samba <sa...@lists.samba.org>:
--
-------------------------------------------
Edson Tadeu Almeida Silveira
http://sites.google.com/site/edsontadeu/
-------------------------------------------
For the unsorted attributeID values errors, can you first try:
samba-tool dbcheck --cross-ncs --fix --yes 'fix_replmetadata_unsorted_attid'
There's too much going on, and it does look like it might be bailing
out. Running it with 'fix_replmetadata_unsorted_attid' should fix those
first errors, then it will probably be easier to figure out what is
happening. The 'ERROR: incorrect GUID component for member in object'
should be completely harmless (and due to objects which have been
recycled) and there's likely a fix to get rid of them to come. However,
it seems there is something else occurring which we may need to look at
in more detail.
As for the KCC, it looks like those are probably stale links from the
old KCC which connected every DC. The KCC is supposed to delete extra
connections, but this doesn't always occur (or does not occur
immediately). Simply deleting those connections should allow the new KCC
to follow all the site requirements.
If you find that DNS zones are not working correctly, this is probably
related to the failing dbcheck, and so you may want to also run:
samba-tool dbcheck --cross-ncs --fix --yes 'fix_replica_locations'
Hopefully that helps some of your issues.
Cheers,
Garming
--
Thanks Garming. Running samba-tool dbcheck --cross-ncs --fix --yes
'fix_replmetadata_unsorted_attid' corrected those errors. All that
remain now are the GUID errors and several of these: 'ERROR: incorrect DN
string component for member in object CN=Domain
Admins,CN=Users,DC=domain,DC=local'.
I corrected the KCC errors by deleting the old KCC connections. I could
tell the difference because the old KCC doesn't set a transport (IP,
SMTP). The new KCC will create connections based on the
'Inter-Site-Transports' defined in Microsoft Active Directory Sites and
Services. However, it still appears to create a full mesh. For instance,
Site 1 and Site 3 should not be replication partners. If I look at the
NTDS settings for Site 1, I see automatically generated connections for
Site 3 with no transport selected. Is this expected behavior?
--
-James
I'm getting several KCC errors on each of my DCs. They are as follows.
[2016/09/21 08:06:12.364447, 0, pid=1087, effective(0, 0), real(0, 0)]
../lib/util/util_runcmd.c:316(samba_runcmd_io_handler)
/usr/local/samba/sbin/samba_kcc: AttributeError: 'NoneType' object
has no attribute 'size'
[2016/09/21 08:06:12.381710, 0, pid=1087, effective(0, 0), real(0, 0)]
../source4/dsdb/kcc/kcc_periodic.c:646(samba_kcc_done)
../source4/dsdb/kcc/kcc_periodic.c:646: Failed samba_kcc -
NT_STATUS_ACCESS_DENIED
[2016/09/21 08:11:12.870383, 0, pid=1087, effective(0, 0), real(0, 0)]
../lib/util/util_runcmd.c:316(samba_runcmd_io_handler)
/usr/local/samba/sbin/samba_kcc: Traceback (most recent call last):
[2016/09/21 08:11:12.870528, 0, pid=1087, effective(0, 0), real(0, 0)]
../lib/util/util_runcmd.c:316(samba_runcmd_io_handler)
/usr/local/samba/sbin/samba_kcc: File
"/usr/local/samba/sbin/samba_kcc", line 337, in <module>
[2016/09/21 08:11:12.870588, 0, pid=1087, effective(0, 0), real(0, 0)]
../lib/util/util_runcmd.c:316(samba_runcmd_io_handler)
/usr/local/samba/sbin/samba_kcc:
attempt_live_connections=opts.attempt_live_connections)
[2016/09/21 08:11:12.870639, 0, pid=1087, effective(0, 0), real(0, 0)]
../lib/util/util_runcmd.c:316(samba_runcmd_io_handler)
/usr/local/samba/sbin/samba_kcc: File
"/usr/local/samba/lib/python2.7/site-packages/samba/kcc/__init__.py",
line 2644, in run
[2016/09/21 08:11:12.870994, 0, pid=1087, effective(0, 0), real(0, 0)]
../lib/util/util_runcmd.c:316(samba_runcmd_io_handler)
/usr/local/samba/sbin/samba_kcc: all_connected = self.intersite(ping)
[2016/09/21 08:11:12.871046, 0, pid=1087, effective(0, 0), real(0, 0)]
../lib/util/util_runcmd.c:316(samba_runcmd_io_handler)
/usr/local/samba/sbin/samba_kcc: File
"/usr/local/samba/lib/python2.7/site-packages/samba/kcc/__init__.py",
line 1883, in intersite
[2016/09/21 08:11:12.871338, 0, pid=1087, effective(0, 0), real(0, 0)]
../lib/util/util_runcmd.c:316(samba_runcmd_io_handler)
/usr/local/samba/sbin/samba_kcc: all_connected =
self.create_intersite_connections()
[2016/09/21 08:11:12.871398, 0, pid=1087, effective(0, 0), real(0, 0)]
../lib/util/util_runcmd.c:316(samba_runcmd_io_handler)
/usr/local/samba/sbin/samba_kcc: File
"/usr/local/samba/lib/python2.7/site-packages/samba/kcc/__init__.py",
line 1817, in create_intersite_connections
[2016/09/21 08:11:12.871676, 0, pid=1087, effective(0, 0), real(0, 0)]
../lib/util/util_runcmd.c:316(samba_runcmd_io_handler)
/usr/local/samba/sbin/samba_kcc: part, True)
[2016/09/21 08:11:12.871724, 0, pid=1087, effective(0, 0), real(0, 0)]
../lib/util/util_runcmd.c:316(samba_runcmd_io_handler)
/usr/local/samba/sbin/samba_kcc: File
"/usr/local/samba/lib/python2.7/site-packages/samba/kcc/__init__.py",
line 1769, in create_connections
[2016/09/21 08:11:12.871999, 0, pid=1087, effective(0, 0), real(0, 0)]
../lib/util/util_runcmd.c:316(samba_runcmd_io_handler)
/usr/local/samba/sbin/samba_kcc: partial_ok, detect_failed)
[2016/09/21 08:11:12.872048, 0, pid=1087, effective(0, 0), real(0, 0)]
../lib/util/util_runcmd.c:316(samba_runcmd_io_handler)
/usr/local/samba/sbin/samba_kcc: File
"/usr/local/samba/lib/python2.7/site-packages/samba/kcc/__init__.py",
line 1419, in create_connection
[2016/09/21 08:11:12.872272, 0, pid=1087, effective(0, 0), real(0, 0)]
../lib/util/util_runcmd.c:316(samba_runcmd_io_handler)
/usr/local/samba/sbin/samba_kcc: not
cn.is_equivalent_schedule(link_sched))):
[2016/09/21 08:11:12.872321, 0, pid=1087, effective(0, 0), real(0, 0)]
../lib/util/util_runcmd.c:316(samba_runcmd_io_handler)
/usr/local/samba/sbin/samba_kcc: File
"/usr/local/samba/lib/python2.7/site-packages/samba/kcc/kcc_utils.py",
line 1223, in is_equivalent_schedule
[2016/09/21 08:11:12.872513, 0, pid=1087, effective(0, 0), real(0, 0)]
../lib/util/util_runcmd.c:316(samba_runcmd_io_handler)
/usr/local/samba/sbin/samba_kcc: if ((self.schedule.size !=
sched.size or
Replication appears to report no errors. Running a KCC check I get the
following.
samba-tool drs kcc
ERROR(runtime): DsExecuteKCC failed - (-1073610699, 'The operation
cannot be performed.')
Switching back to the old KCC clears the errors up.
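The traceback above ends inside is_equivalent_schedule at a '.size'
access, and the AttributeError says that one of the two schedule objects
was None. A minimal sketch of a comparison that tolerates a missing
schedule on either side is below. The Schedule class and its 'data' field
are hypothetical stand-ins for illustration only; this is not Samba's
code, nor necessarily how the bug should be fixed upstream.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Schedule:
    """Hypothetical stand-in for a replication schedule object."""
    size: int
    data: bytes = b""


def is_equivalent_schedule(mine: Optional[Schedule],
                           other: Optional[Schedule]) -> bool:
    """Compare two schedules, treating None as 'no schedule set'."""
    # Both unset: equivalent. Exactly one unset: not equivalent.
    if mine is None or other is None:
        return mine is None and other is None
    # Only now is it safe to touch attributes such as .size.
    return mine.size == other.size and mine.data == other.data
```

The point is only that the None checks have to cover both arguments
before any attribute is read.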
--
-James
Ignoring this error and answering your earlier question, it's often hard
to tell if the KCC is doing what is expected or not. To figure that out,
you need a lot more information about the network topology
configuration, which site links are defined, who belongs to which site
in the database and probably some of the debug logs (running samba_kcc
--debug manually). It also takes a little while for everything to settle
down, until everyone is completely aware of who is online.
Cheers,
Garming
--
I initially deleted all the NTDS site links for each site and allowed
the new KCC to create them. However, I don't believe it created them
correctly: it defined what appeared to be a bridgehead server at each
site. So I disabled the new KCC with 'kccsrv:samba_kcc=false' in my
smb.conf and allowed the full mesh to be used again. After all site
links were recreated, I switched back to 'kccsrv:samba_kcc=true' in my
smb.conf, and that is what prompted the errors above.
To expand on my topology, I have 3 sites. I'll call them A, B and C.
Each site contains 2 DCs. The sites use different subnets and are
connected via fiber. Sites B and C should not be replication partners.
They should only replicate with Site A (Default-First-Site-Name). With
the new KCC, after deleting all the NTDS links, Domain Controller #1 at
sites B and C becomes the bridgehead server for its site. Domain
Controller #2 at sites B and C only replicates with Domain Controller #1
at its respective site. So if the bridgehead server goes down, Domain
Controller #2 at sites B and C will no longer receive changes.
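To make the concern concrete, here is a small model of the two layouts
described above. It is a deliberately simplified sketch, not the KCC
algorithm: the DC names are made up, the first DC in each site is simply
assumed to be the bridgehead, and no failover is modelled.

```python
from itertools import permutations

# Hypothetical names: three sites with two DCs each; B and C are not linked.
SITES = {"A": ["A-DC1", "A-DC2"],
         "B": ["B-DC1", "B-DC2"],
         "C": ["C-DC1", "C-DC2"]}
SITE_LINKS = [("A", "B"), ("A", "C")]


def full_mesh(sites):
    """Every DC pulls from every other DC: (source, destination) pairs."""
    dcs = [dc for members in sites.values() for dc in members]
    return set(permutations(dcs, 2))


def bridgehead_topology(sites, links):
    """Intra-site pairs plus one bridgehead pair per site link."""
    conns = set()
    for members in sites.values():
        conns |= set(permutations(members, 2))
    for s1, s2 in links:
        b1, b2 = sites[s1][0], sites[s2][0]  # assumption: first DC is bridgehead
        conns |= {(b1, b2), (b2, b1)}
    return conns


def inbound_sources(conns, dc, down=()):
    """Sources `dc` can still pull from while the DCs in `down` are offline."""
    return sorted(src for src, dst in conns if dst == dc and src not in down)
```

In this model, inbound_sources(bridgehead_topology(SITES, SITE_LINKS),
"B-DC2", down=("B-DC1",)) is empty, which is the "island" case, while the
full mesh still leaves four sources.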
The new KCC does prevent sites B and C from replicating with each other.
That is correct. This isn't a huge issue for me. I can continue using
the old KCC for now. The full mesh isn't detrimental to my network.
Don't want to take up too much of your time. Thanks
--
-James
Thanks,
Garming
I went ahead and enabled the new KCC. I deleted all the automatically
generated NTDS links and let Samba create them. I did this through the
Microsoft Active Directory Sites and Services tool. I didn't see the
option to delete with 'samba-tool drs options --help'. I did run
'samba-tool drs kcc' to force the check and not wait. I see all the
automatically generated site links are created as you say they should.
I shut down one of the bridgehead servers in a site (killall samba). In
my case it's SOLDC1 in Site B. I ran 'samba-tool drs kcc' on all DCs to
see if a new KCC connection would be created on SOLDC2 in site B. It
never was. I then restarted SOLDC2 in site B, and still no connection
was created. This is all with SOLDC1 in site B still down. This tells me
SOLDC2 becomes an island without any way to replicate.
One strange thing is 'samba-tool drs showrepl' begs to differ.
root@soldc2:~# samba-tool drs showrepl
site-b\SOLDC2
DSA Options: 0x00000001
DSA object GUID: 25055641-49e7-4b3f-a7e3-9d206375306c
DSA invocationId: d11890e8-6b90-4e94-aca4-6d7a940f47b5
==== INBOUND NEIGHBORS ====
CN=Configuration,DC=domain,DC=local
site-b\SOLDC1 via RPC
DSA object GUID: 55e069f5-4f47-415b-8fa4-a398948235aa
Last attempt @ Fri Sep 23 14:40:18 2016 EDT was successful
0 consecutive failure(s).
Last success @ Fri Sep 23 14:40:18 2016 EDT
DC=DomainDnsZones,DC=domain,DC=local
site-b\SOLDC1 via RPC
DSA object GUID: 55e069f5-4f47-415b-8fa4-a398948235aa
Last attempt @ Fri Sep 23 14:42:24 2016 EDT was successful
0 consecutive failure(s).
Last success @ Fri Sep 23 14:42:24 2016 EDT
DC=DomainDnsZones,DC=domain,DC=local
Default-First-Site-Name\PFDC2 via RPC
DSA object GUID: e6284e90-f964-4643-b6a6-5baafdd7ba36
Last attempt @ Fri Sep 23 14:42:34 2016 EDT was successful
0 consecutive failure(s).
Last success @ Fri Sep 23 14:42:34 2016 EDT
DC=DomainDnsZones,DC=domain,DC=local
site-c\DUNDC1 via RPC
DSA object GUID: a216e718-488f-4821-8d9c-a399e6789222
Last attempt @ Fri Sep 23 14:42:32 2016 EDT was successful
0 consecutive failure(s).
Last success @ Fri Sep 23 14:42:32 2016 EDT
DC=DomainDnsZones,DC=domain,DC=local
site-c\DUNDC2 via RPC
DSA object GUID: 3c08db42-9416-40df-99ad-6d0c0ec554a6
Last attempt @ Fri Sep 23 14:41:00 2016 EDT was successful
0 consecutive failure(s).
Last success @ Fri Sep 23 14:41:00 2016 EDT
DC=DomainDnsZones,DC=domain,DC=local
Default-First-Site-Name\PFDC1 via RPC
DSA object GUID: acc2392f-9567-450f-bcb3-4fb1034b8753
Last attempt @ Fri Sep 23 14:40:58 2016 EDT was successful
0 consecutive failure(s).
Last success @ Fri Sep 23 14:40:58 2016 EDT
CN=Schema,CN=Configuration,DC=domain,DC=local
site-b\SOLDC1 via RPC
DSA object GUID: 55e069f5-4f47-415b-8fa4-a398948235aa
Last attempt @ Fri Sep 23 14:40:19 2016 EDT was successful
0 consecutive failure(s).
Last success @ Fri Sep 23 14:40:19 2016 EDT
DC=domain,DC=local
site-b\SOLDC1 via RPC
DSA object GUID: 55e069f5-4f47-415b-8fa4-a398948235aa
Last attempt @ Fri Sep 23 14:40:20 2016 EDT was successful
0 consecutive failure(s).
Last success @ Fri Sep 23 14:40:20 2016 EDT
DC=ForestDnsZones,DC=domain,DC=local
site-b\SOLDC1 via RPC
DSA object GUID: 55e069f5-4f47-415b-8fa4-a398948235aa
Last attempt @ Fri Sep 23 14:40:18 2016 EDT was successful
0 consecutive failure(s).
Last success @ Fri Sep 23 14:40:18 2016 EDT
==== OUTBOUND NEIGHBORS ====
CN=Configuration,DC=domain,DC=local
site-c\DUNDC2 via RPC
DSA object GUID: 3c08db42-9416-40df-99ad-6d0c0ec554a6
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
CN=Configuration,DC=domain,DC=local
Default-First-Site-Name\PFDC1 via RPC
DSA object GUID: acc2392f-9567-450f-bcb3-4fb1034b8753
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
CN=Configuration,DC=domain,DC=local
site-b\SOLDC1 via RPC
DSA object GUID: 55e069f5-4f47-415b-8fa4-a398948235aa
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
CN=Configuration,DC=domain,DC=local
Default-First-Site-Name\PFDC2 via RPC
DSA object GUID: e6284e90-f964-4643-b6a6-5baafdd7ba36
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
CN=Configuration,DC=domain,DC=local
site-c\DUNDC1 via RPC
DSA object GUID: a216e718-488f-4821-8d9c-a399e6789222
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
DC=DomainDnsZones,DC=domain,DC=local
site-c\DUNDC2 via RPC
DSA object GUID: 3c08db42-9416-40df-99ad-6d0c0ec554a6
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
DC=DomainDnsZones,DC=domain,DC=local
Default-First-Site-Name\PFDC1 via RPC
DSA object GUID: acc2392f-9567-450f-bcb3-4fb1034b8753
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
DC=DomainDnsZones,DC=domain,DC=local
Default-First-Site-Name\PFDC2 via RPC
DSA object GUID: e6284e90-f964-4643-b6a6-5baafdd7ba36
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
DC=DomainDnsZones,DC=domain,DC=local
site-b\SOLDC1 via RPC
DSA object GUID: 55e069f5-4f47-415b-8fa4-a398948235aa
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
DC=DomainDnsZones,DC=domain,DC=local
site-c\DUNDC1 via RPC
DSA object GUID: a216e718-488f-4821-8d9c-a399e6789222
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
CN=Schema,CN=Configuration,DC=domain,DC=local
site-c\DUNDC2 via RPC
DSA object GUID: 3c08db42-9416-40df-99ad-6d0c0ec554a6
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
CN=Schema,CN=Configuration,DC=domain,DC=local
Default-First-Site-Name\PFDC1 via RPC
DSA object GUID: acc2392f-9567-450f-bcb3-4fb1034b8753
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
CN=Schema,CN=Configuration,DC=domain,DC=local
site-b\SOLDC1 via RPC
DSA object GUID: 55e069f5-4f47-415b-8fa4-a398948235aa
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
CN=Schema,CN=Configuration,DC=domain,DC=local
Default-First-Site-Name\PFDC2 via RPC
DSA object GUID: e6284e90-f964-4643-b6a6-5baafdd7ba36
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
CN=Schema,CN=Configuration,DC=domain,DC=local
site-c\DUNDC1 via RPC
DSA object GUID: a216e718-488f-4821-8d9c-a399e6789222
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
DC=domain,DC=local
Default-First-Site-Name\PFDC1 via RPC
DSA object GUID: acc2392f-9567-450f-bcb3-4fb1034b8753
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
DC=domain,DC=local
site-c\DUNDC2 via RPC
DSA object GUID: 3c08db42-9416-40df-99ad-6d0c0ec554a6
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
DC=domain,DC=local
site-b\SOLDC1 via RPC
DSA object GUID: 55e069f5-4f47-415b-8fa4-a398948235aa
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
DC=domain,DC=local
Default-First-Site-Name\PFDC2 via RPC
DSA object GUID: e6284e90-f964-4643-b6a6-5baafdd7ba36
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
DC=domain,DC=local
site-c\DUNDC1 via RPC
DSA object GUID: a216e718-488f-4821-8d9c-a399e6789222
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
DC=ForestDnsZones,DC=domain,DC=local
site-c\DUNDC2 via RPC
DSA object GUID: 3c08db42-9416-40df-99ad-6d0c0ec554a6
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
DC=ForestDnsZones,DC=domain,DC=local
Default-First-Site-Name\PFDC1 via RPC
DSA object GUID: acc2392f-9567-450f-bcb3-4fb1034b8753
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
DC=ForestDnsZones,DC=domain,DC=local
site-b\SOLDC1 via RPC
DSA object GUID: 55e069f5-4f47-415b-8fa4-a398948235aa
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
DC=ForestDnsZones,DC=domain,DC=local
Default-First-Site-Name\PFDC2 via RPC
DSA object GUID: e6284e90-f964-4643-b6a6-5baafdd7ba36
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
DC=ForestDnsZones,DC=domain,DC=local
site-c\DUNDC1 via RPC
DSA object GUID: a216e718-488f-4821-8d9c-a399e6789222
Last attempt @ NTTIME(0) was successful
0 consecutive failure(s).
Last success @ NTTIME(0)
==== KCC CONNECTION OBJECTS ====
Connection --
Connection name: 7b7ddab7-4377-44f4-9831-8fe7feb55115
Enabled : TRUE
Server DNS name : SOLDC1.domain.local
Server DN name : CN=NTDS Settings,CN=SOLDC1,CN=Servers,CN=site-b,CN=Sites,CN=Configuration,DC=domain,DC=local
TransportType: RPC
options: 0x00000001
Warning: No NC replicated for Connection!
I have what appears to still be a full mesh replication. Shouldn't the
outbound and inbound neighbors be reflective of the KCC connection
objects? I would expect to find only inbound and outbound connections
for SOLDC1. Maybe I'm completely misinterpreting the intended behavior.
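One way to check this mechanically is to parse the showrepl text and
compare the inbound neighbors against the servers named in the KCC
connection objects. The sketch below assumes the output layout shown
above (section headers wrapped in '====', 'site\DC via RPC' lines, and
'Server DNS name :' lines); the function names are made up for this
example.

```python
import re

SECTION = re.compile(r"^\s*=+\s*(?P<name>.+?)\s*=+\s*$")
NEIGHBOR = re.compile(r"^\s*(?P<site>[^\\\s]+)\\(?P<dc>\S+)\s+via\s+RPC\s*$")
SERVER = re.compile(r"^\s*Server DNS name\s*:\s*(?P<host>\S+)")


def summarize_showrepl(text):
    """Collect DC names per section of 'samba-tool drs showrepl' output."""
    result = {"INBOUND NEIGHBORS": set(),
              "OUTBOUND NEIGHBORS": set(),
              "KCC CONNECTION OBJECTS": set()}
    section = None
    for line in text.splitlines():
        header = SECTION.match(line)
        if header:
            section = header.group("name")
            continue
        if section in ("INBOUND NEIGHBORS", "OUTBOUND NEIGHBORS"):
            m = NEIGHBOR.match(line)
            if m:
                result[section].add(m.group("dc").upper())
        elif section == "KCC CONNECTION OBJECTS":
            m = SERVER.match(line)
            if m:
                # Keep only the host label so it compares with neighbor names.
                result[section].add(m.group("host").split(".")[0].upper())
    return result


def neighbors_without_connection(summary):
    """Inbound neighbors that no KCC connection object accounts for."""
    return sorted(summary["INBOUND NEIGHBORS"]
                  - summary["KCC CONNECTION OBJECTS"])
```

For the output above, the only connection object names SOLDC1, so the
neighbors reported as unaccounted for would be DUNDC1, DUNDC2, PFDC1 and
PFDC2.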
--
-James
There are likely at least some stale entries (repsFrom). The KCC builds
the inbound connections for each DC. Then, as a separate step, it
translates the connections to replication links. The outbound links are
mostly the other DCs' problem (likely an old repsFrom pulling from
SOLDC1). I've taken quite a few steps to rid the DCs of as many old
repsFrom entries as possible from within the KCC, but based on time
delays and use of the old KCC, this may not be enough in its current
state to be equivalent to a fresh domain.
I've taken another look and it's plausible that the failover for inbound
connections won't occur for 2 hours thanks to the default of the
interSiteTopologyFailover variable on the site objects. I would be
interested in the result if you set the variable (which I think is in
minutes) to something much lower.
This area is definitely not simple, and it has a lot of room to improve
(one bug I see here is 'Last attempt @ NTTIME(0) was successful', which
has an unmerged fix to get the right time, I believe). But it is a vast
improvement on the old code, especially at scale.
Cheers,
Garming
....
The job of the samba_kcc script is to create the ntdsConnection objects.
Afterward, the repsFrom/repsTo attributes are created in accordance with
the ntdsConnection objects (although you can force the creation using
samba-tool drs replicate). You can see that the process is asynchronous
when you join a new DC: the INBOUND and OUTBOUND entries appear later,
after the ntdsConnection object has been created.
You can find the repsFrom/repsTo attributes on the root LDAP entries of
each of the five AD partitions. Those entries correspond to the INBOUND
and OUTBOUND sections displayed by the samba-tool drs showrepl command.
However, there is currently no standard way to delete leftover
repsFrom/repsTo values, other than deleting the repsFrom/repsTo
attributes manually or through scripting (python-ldb is your friend here).
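As a read-only starting point before any cleanup, the sketch below
builds one ldbsearch command per partition root to show the
repsFrom/repsTo values stored there. It assumes a single-domain forest
(so ForestDnsZones sits under the same domain DN) and the sam.ldb path
used elsewhere in this thread; the helper names are made up. The values
are binary blobs, so expect encoded output.

```python
import shlex


def partition_roots(domain_dn):
    """Root DNs of the five partitions (single-domain forest assumed)."""
    config = "CN=Configuration," + domain_dn
    return [
        domain_dn,
        config,
        "CN=Schema," + config,
        "DC=DomainDnsZones," + domain_dn,
        "DC=ForestDnsZones," + domain_dn,
    ]


def reps_queries(domain_dn, sam_ldb="/usr/local/samba/private/sam.ldb"):
    """Build base-scope ldbsearch commands; nothing is executed or changed."""
    commands = []
    for root in partition_roots(domain_dn):
        argv = ["ldbsearch", "-H", sam_ldb, "-b", root, "-s", "base",
                "repsFrom", "repsTo"]
        commands.append(" ".join(shlex.quote(arg) for arg in argv))
    return commands


if __name__ == "__main__":
    for command in reps_queries("DC=domain,DC=local"):
        print(command)
```

Run the printed commands as root on the DC you want to inspect, and
compare the results with what showrepl reports.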
I had a discussion with Garming a while ago about this issue, and it was
not clear which process was responsible for removing spurious or leftover
repsFrom/repsTo attributes. With the old KCC this was not such an issue
because it was fully meshed; with the new KCC, however, it would indeed
be good to have some more tooling for DRS maintenance and monitoring.
By the way, the KCC computation algorithm specifications from Microsoft
are kind of mind-boggling, so some more tweaking might be needed, but
thanks to Garming it has done the job for us since 4.3.0, for almost
one year.
Cheers,
Denis
--
Denis Cardon
Tranquil IT Systems
Les Espaces Jules Verne, bâtiment A
12 avenue Jules Verne
44230 Saint Sébastien sur Loire
tel : +33 (0) 2.40.97.57.55
http://www.tranquil-it-systems.fr
...
About the cleanup of repsFrom/repsTo, is the cleanup code included in
4.4.5 or only in 4.5? I have not yet found time to test that new
version, but at least in 4.4.5, I still have the behavior where leftover
repsFrom/repsTo are not automatically deleted. I hope to find time to
test 4.5 next week.
Cheers,
Denis
PS : sorry for my parallel response to that thread, I didn't see your
mail before hitting the send button.
Some improvements were made to the KCC in 4.5.0 with regard to removing
repsFrom/repsTo. repsTo should no longer push updates to dead DCs (as
well as repsFrom), and there were some changes to fix issues with the
DomainDns and ForestDns partitions (where the NCReplicaLocations
attribute was not set).
Cheers,
Garming
--
Garming,
What is the command and syntax to query Samba for the
interSiteTopologyFailover variable? If I use ADSI edit to view the
variable it displays as '<not set>'.
What's also odd is the interSiteTopologyGenerator variable. My
understanding is that the ISTG should only be defined on one DC in each
site. It is, but the DC that is chosen is odd. In my case, for
'Site-B', it's SOLDC2. That would explain why shutting down SOLDC1 didn't
prompt the KCC to create new NTDS connections. SOLDC1 is the DC that has
automatically generated connections to the 'Default-First-Site-Name'.
SOLDC2 only has KCC connections to SOLDC1. Isn't Samba defining the
incorrect server as being the ISTG bridgehead server? This is the case
for my other two sites as well. Thanks.
--
-James
I wasn't aware of this. Thank you for the info. If I were to delete the
incorrect repsFrom/repsTo attributes, wouldn't the KCC just regenerate
them over time once the KCC check and ISTG check kicked in?
--
-James
I should point out for reference how I was trying to query the
interSiteTopologyFailover variable.
ldbsearch -H /usr/local/samba/private/sam.ldb '(&(objectclass=person)(name=Guest))' name intersiteTopologyFailover
# record 1
dn: CN=Guest,CN=Users,DC=domain,DC=local
name: Guest
# Referral
ref: ldap://domain.local/CN=Configuration,DC=domain,DC=local
# Referral
ref: ldap://domain.local/DC=DomainDnsZones,DC=domain,DC=local
# Referral
ref: ldap://domain.local/DC=ForestDnsZones,DC=domain,DC=local
# returned 4 records
# 1 entries
# 3 referrals
Is this correct?
--
-James
> Wasn't aware of this. Thank you for the info. If I was to delete the
> incorrect repsFrom/repsTo attributes, wouldn't the KCC just regenerate
> them over time once the KCC check and ISTG check kicked in?
Like Garming was saying, there is a step separate from the KCC topology
calculation that translates the ntdsConnection objects to replication
links. That separate process creates the attributes based on the
ntdsConnection objects, so if there is no spurious ntdsConnection object,
it won't create spurious replication links.
However, there is a caveat at this time. The repsFrom attributes on one
DC mirror the repsTo attributes on the remote DC, and a repsFrom on one
DC will trigger the re-creation of the repsTo on the corresponding
remote DC. So when you want to do the cleanup, you have to firewall the
two DCs so that their dreplsrv services cannot see each other, delete
the spurious attributes, and then remove the firewalling.
Yes, it is not very convenient, but with a little bit of scripting you
can do it very fast. I did it recently on a 50-DC network.
Cheers,
Denis
As long as the topology doesn't change or DCs which are not bridgeheads
do not go offline, there should be basically zero additional reps over
time. How often they build up over time (when DCs do go offline) is an
open question; I can't test every setup and I'm sure there are edge
cases. However, if these additional links are there when you have
spuriously unreliable DCs, they work just as well as a fallback.
The interSiteTopologyFailover attribute seems to be on the
NTDS-Site-Settings class. By default it probably isn't defined, but the
internal default value in both Samba and Windows is 2 hours.
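For anyone wanting to look at the attribute directly, the sketch below
builds an ldbsearch against the site's 'NTDS Site Settings' object
rather than a user object, and shows how an unset value would fall back
to the 2-hour default. The helper names are made up; the unit of minutes
and the treatment of 0 as "unset" are assumptions that should be checked
against the KCC code or the Microsoft specification.

```python
DEFAULT_FAILOVER_MINUTES = 120  # the 2-hour internal default mentioned above


def site_settings_dn(site, domain_dn):
    """DN of the NTDS Site Settings object for one site."""
    return ("CN=NTDS Site Settings,CN=%s,CN=Sites,CN=Configuration,%s"
            % (site, domain_dn))


def failover_query(site, domain_dn,
                   sam_ldb="/usr/local/samba/private/sam.ldb"):
    """argv for a base-scope ldbsearch of the failover and ISTG attributes."""
    return ["ldbsearch", "-H", sam_ldb,
            "-b", site_settings_dn(site, domain_dn), "-s", "base",
            "interSiteTopologyFailover", "interSiteTopologyGenerator"]


def effective_failover_minutes(attr_value):
    """Value in effect: the attribute if set and non-zero, else the default."""
    if attr_value is None:
        return DEFAULT_FAILOVER_MINUTES
    minutes = int(attr_value)
    # Assumption: 0 is treated the same as 'not set'.
    return minutes if minutes > 0 else DEFAULT_FAILOVER_MINUTES
```

An ADSI Edit display of '<not set>' would correspond to attr_value being
None here, so the effective value would be 120 minutes.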
The ISTG is not the same as the bridgehead server. The ISTG is a single
DC in the site which coordinates all the DCs and picks bridgehead
servers in the site to talk to other sites (to some bridgehead DC
arbitrarily chosen on the other end). The reason I ask who the ISTG was
is that if the ISTG is dead, it is reasonable to expect that there is
no current coordinator who is site-aware, and so no fallback has
occurred yet.
Cheers,
Garming
This is what seems to be tripping me up, although I think I understand
it a bit better now. Samba isn't defining a bridgehead server (which I
do not want). I was under the impression that the owner of the ISTG role
was in fact a bridgehead server. Reading this link,
https://support.microsoft.com/en-us/kb/224815, tells me 'The domain
controller holding this role may not necessarily also be a bridgehead
server'. To verify, I queried for the CN 'Bridgehead-Server-List-BL',
which is also not set. Is this hard-coded in Samba and I'm unable to see
it, or is this not the correct attribute to confirm?
The link also describes how a DC alerts other DCs that an ISTG has gone
down in a site. This is the critical component I was worried about. Is
this feature currently implemented in Samba? On a Microsoft DC you can
set how often you want to check for the ISTG in a registry setting. Do
you have plans to add this as an option for smb.conf?
I will also point out Samba did correctly set the ISTG for my sites to
DC1. The first DC I joined to that site. After deleting the NTDS
connections, I see that my second DC in a site was chosen as the ISTG.
This tells me some sort of check may be happening to switch the ISTG?
Based on all this it appears the new KCC does in fact work correctly
with a few minor issues relating to the replications To and From. Thanks
for the hard work.
--
-James
I was mistaken on another point. I ran 'samba_kcc --debug' and saw
mention of a bridgehead server. Reading additional documentation, I see a
difference between a 'bridgehead server' and a 'preferred bridgehead
server'. It's the preferred bridgehead server I do not want defined. This
is all starting to become clearer.
That likely maps to some attributes in the directory which we probably
partly respect. I think a bit more work is needed to allow the failover
times to have better configuration.
> I will also point out Samba did correctly set the ISTG for my sites to
> DC1. The first DC I joined to that site. After deleting the NTDS
> connections, I see that my second DC in a site was chosen as the ISTG.
> This tells me some sort of check may be happening to switch the ISTG?
>
The role can rotate from time to time for various reasons, so doesn't
necessarily stay fixed.
> Based on all this it appears the new KCC does in fact work correctly
> with a few minor issues relating to the replications To and From.
> Thanks for the hard work.
It's still a work in progress and there are some failover modes which I
would like to improve, but in a happy, relatively healthy network,
replication traffic should be heavily decreased, CPU usage should be
less of a problem and the DC should suffer fewer blockages. With this
work, I look forward to seeing what Samba will be able to achieve in
future.
Cheers,
Garming