Samba 3.5.6~dfsg-1 on Debian Squeeze. Updates are current.
Ever since upgrading from 3.4.x to 3.5.x, winbindd_cache.tdb is
repeatedly corrupted, rendering authentication impossible.
I have a script that checks every 15 minutes for connectivity to the AD
server and restarts samba and winbind if the connection has failed.
This happens on every winbind system, so it is not system specific.
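For anyone wanting to reproduce the workaround, a check of this kind could look roughly like the sketch below. This is not my actual script. It assumes that `wbinfo -t` (the trust secret check) is a good enough connectivity test and that the Debian init scripts /etc/init.d/samba and /etc/init.d/winbind are in use; the function names are made up for illustration.

```python
import subprocess


def winbind_trust_ok(run=subprocess.call):
    """Return True if `wbinfo -t` exits with status 0.

    Assumption: a failing trust secret check is treated as "connection
    to the AD server is broken". `run` is injectable so the logic can be
    exercised without Samba installed.
    """
    return run(["wbinfo", "-t"]) == 0


def check_and_restart(run=subprocess.call):
    """Restart samba and winbind if the check fails.

    Returns True if a restart was triggered, False otherwise.
    """
    if winbind_trust_ok(run):
        return False
    # Assumed Debian init script names; adjust for other distributions.
    for service in ("samba", "winbind"):
        run(["/etc/init.d/" + service, "restart"])
    return True
```

Calling `check_and_restart()` from a cron job every 15 minutes (as root) would give the behaviour described above.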
ls of this file in /var/cache/samba is as follows:
-rw------- 1 root root 139264 Nov 16 11:21 winbindd_cache.tdb
-rw------- 1 root root 946176 Nov 16 11:20 winbindd_cache.tdb.bak
-rw------- 1 root root 946176 Nov 16 09:16 winbindd_cache.tdb.bak.old
The corrupted tdb files (946176 bytes) are almost 7 times as large as
the working version (139264 bytes).
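The ratio can be checked directly from the listing above. The helper name below is made up, and the snippet assumes the usual `ls -l` layout with the size in the fifth field.

```python
def size_from_ls(line):
    """Extract the byte size from one `ls -l` line (fifth field)."""
    return int(line.split()[4])


working = "-rw------- 1 root root 139264 Nov 16 11:21 winbindd_cache.tdb"
corrupted = "-rw------- 1 root root 946176 Nov 16 11:20 winbindd_cache.tdb.bak"

ratio = size_from_ls(corrupted) / size_from_ls(working)
print(f"{ratio:.2f}")  # prints 6.79
```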
In 3.3.5, a winbind_cache.corrupt was generated at each restart after
a corruption. In 3.5.6, 2 backups are maintained instead. This is the
only difference I have noted between the two versions.
I notice this is repeated in the logs at the time of the corruption:
[2010/11/16 11:19:40.933366, 10] winbindd/winbindd_cache.c:4674(wcache_fetch_ndr)
  Entry has wrong sequence number: 1573542
[2010/11/16 11:19:40.933684, 1] winbindd/winbindd_util.c:289(trustdom_recv)
  Could not receive trustdoms
A level 10 log during a corruption is attached.
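To pull the relevant entries out of a large level 10 log, something like the following sketch may help. It assumes the header format shown above (a bracketed timestamp and debug level, with the message on a following line); the function name and the regular expression are my own and not part of Samba.

```python
import re

# Matches the start of a Samba log header such as
# "[2010/11/16 11:19:40.933684, 1] winbindd/winbindd_util.c:289(trustdom_recv)"
HEADER = re.compile(r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+),\s*(\d+)\]")


def find_messages(lines, needle):
    """Return (timestamp, message) pairs for lines containing `needle`.

    The timestamp is taken from the most recent header line seen, or is
    None if no header preceded the message.
    """
    hits = []
    stamp = None
    for line in lines:
        match = HEADER.match(line.strip())
        if match:
            stamp = match.group(1)
        if needle in line:
            hits.append((stamp, line.strip()))
    return hits
```

For example, `find_messages(open(path), "Could not receive trustdoms")` lists each failure with its timestamp, where `path` is whichever winbind log file is being examined.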
Dependencies:
adduser 3.112
libc6 2.11.2-7
libcomerr2 1.41.12-2
libkrb53 1.6.dfsg.4~beta1-13
libldap-2.4-2 2.4.23-6
libpam0g 1.1.1-6.1
libpopt0 1.16-1
libtalloc1 1.2.0~git20080616-1
libwbclient0 2:3.5.6~dfsg-1
lsb-base 3.2-23.1
samba-common 2:3.5.6~dfsg-1
Also:
linux-image-2.6.32-5-686 2.6.32-27
smb.conf:
[global]
workgroup = DOMAIN
realm = DOMAIN.COM
server string = %h server (Samba %v)
security = ADS
allow trusted domains = No
map to guest = Bad User
obey pam restrictions = Yes
password server = ad_dc
passdb backend = tdbsam
username map = /etc/samba/users.map
log level = 1 winbind:10 idmap:4
log file = /var/log/samba/%m
max log size = 1000
name resolve order = wins host bcast
deadtime = 15
load printers = No
printcap name = cups
wins proxy = Yes
wins server = 192.168.x.xxx
ldap ssl = no
panic action = /usr/share/samba/panic-action %d
#idmap backend = rid:DOMAIN=1000-20000000
idmap backend = tdb
idmap uid = 1000-20000000
idmap gid = 1000-20000000
idmap config DOMAIN : backend = rid
idmap config DOMAIN : range = 1000 - 20000000
template homedir = /home/domain/%U
template shell = /bin/bash
winbind cache time = 10
winbind enum users = Yes
winbind enum groups = Yes
winbind use default domain = Yes
winbind offline logon = Yes
admin users = root, DOMAIN\user1, "@DOMAIN\group1"
ea support = Yes
map archive = No
map readonly = no
store dos attributes = Yes
I found that the corruptions occurred less frequently if I converted to
the newer idmap_rid syntax; however, they still occur at the rate of 5
to 6 daily.
Thanks,
Dale Schroeder
Thanks for all the details. I'll forward this upstream ASAP and will let
you know through the BTS. It would be good if you could also check
upstream's bug report in case they ask for more details or tests, as
this problem may well be related to your setup or environment.
I'll let you know when the bug is forwarded upstream (I need to be
online for this, thanks to crappy Bugzilla, which can't be used by mail
only).
Quoting Dale Schroeder (da...@BriannasSaladDressing.com):
> Package: winbind
> Version: 2:3.5.6~dfsg-1
>
> Samba 3.5.6~dfsg-1 on Debian Squeeze. Updates are current.
>
> Ever since upgrading from 3.4.x to 3.5.x, winbind_cache.tdb is
> repeatedly corrupted, rendering authentication impossible.
This is now forwarded upstream as #7818.
We got the following comments/requests:
https://bugzilla.samba.org/show_bug.cgi?id=7818
------- Comment #2 from v...@samba.org 2010-11-24 14:44 CST -------
These "entry has wrong sequence number" messages are expected if something
changes on the AD DC, they are normal, they are not a failure. We need debug
level 10 logs of the actually failed request. log.winbindd* and log.wb*
(there's more than one winbind log file since version 3.2).
Dale, is there any chance you can get these?
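If it helps, the requested files could be bundled for attachment with a small sketch like the one below. It assumes the logs match the names upstream mentions (log.winbindd* and log.wb*); with `log file = /var/log/samba/%m` as in the posted smb.conf the actual names may differ, so the directory and patterns are parameters. The function name is made up for illustration.

```python
import glob
import os
import tarfile


def collect_winbind_logs(log_dir, archive_path,
                         patterns=("log.winbindd*", "log.wb*")):
    """Pack all files in log_dir matching the patterns into a .tar.gz.

    Returns the sorted list of file names that were added.
    """
    paths = []
    for pattern in patterns:
        paths.extend(glob.glob(os.path.join(log_dir, pattern)))
    paths = sorted(set(paths))
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in paths:
            # Store files by base name so the archive has no absolute paths.
            tar.add(path, arcname=os.path.basename(path))
    return [os.path.basename(path) for path in paths]
```

A call such as `collect_winbind_logs("/var/log/samba", "/tmp/winbind-logs.tar.gz")` would then produce one archive to attach to the upstream report. Level 10 logs can contain hostnames and account names, so they may need reviewing before posting.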