As the topic says, I've experienced some unusual sshd behavior after I moved
some of my systems from RELENG_5_3 to RELENG_5 recently. The behavior is
illustrated by the following excerpt from /var/log/auth.log on the RELENG_5
system:
Jan 29 14:53:38 mail sshd[699]: login_getclass: unknown class 'root'
Jan 29 14:53:38 mail last message repeated 3 times
Jan 29 14:53:38 mail sshd[699]: Accepted publickey for root from 192.168.0.1 port 60094 ssh2
Jan 29 14:53:38 mail sshd[698]: Accepted publickey for root from 192.168.0.1 port 60094 ssh2
Jan 29 15:32:15 mail sshd[836]: login_getclass: unknown class 'root'
Jan 29 15:32:15 mail last message repeated 3 times
Jan 29 15:32:15 mail sshd[836]: Accepted publickey for root from 192.168.0.1 port 53837 ssh2
Jan 29 15:32:15 mail sshd[835]: Accepted publickey for root from 192.168.0.1 port 53837 ssh2
Jan 29 16:40:16 mail sshd[1034]: login_getclass: unknown class 'root'
Jan 29 16:40:16 mail last message repeated 3 times
Jan 29 16:40:16 mail sshd[1034]: Accepted publickey for root from 192.168.0.1 port 54714 ssh2
Jan 29 16:40:16 mail sshd[1033]: Accepted publickey for root from 192.168.0.1 port 54714 ssh2
Jan 29 17:10:27 mail sshd[1125]: login_getclass: unknown class 'root'
Jan 29 17:10:27 mail last message repeated 3 times
Jan 29 17:10:27 mail sshd[1125]: Accepted publickey for root from 192.168.0.1 port 54337 ssh2
Jan 29 17:10:27 mail sshd[1124]: Accepted publickey for root from 192.168.0.1 port 54337 ssh2
All of the systems have a login.conf which contains an entry for the root
class. I've rebuilt the login.conf.db database to make sure that it's not a
filesystem glitch, and even copied the default login.conf from /usr/src and
rebuilt the login.conf.db database from it, but none of that helped. The manual
page for login_getclassbyname() explicitly states:
In addition, if the referenced user has a UID of 0 (normally, "root", although
the user name is not considered) then login_getpwclass() will search for
a record with an id of "root" before it searches for the record with the id of
"default".
So, the "root" entry IS there but for some reason either sshd is being buggy or
login_getclassbyname() is behaving strangely because as far as I know this
shouldn't be happening.
Also, for some reason, for each successful login attempt there are two
identical entries, apparently made by two different instances/forks of sshd
since they have different PIDs. This started happening at the same time as the
first problem appeared, which is after the recent upgrade from RELENG_5_3 to
RELENG_5.
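The duplicate entries can also be confirmed without reading the log by eye, by grouping the "Accepted" lines on timestamp and message and reporting those logged under more than one PID. What follows is only a minimal sketch, assuming POSIX sh and awk and the syslog line layout shown in the excerpt above. The name find_duplicate_logins is a hypothetical helper, not part of sshd or FreeBSD.

```shell
#!/bin/sh
# find_duplicate_logins: read auth.log text on stdin and print every
# "Accepted ..." message that sshd logged under more than one PID.
# Hypothetical helper; assumes a 15-character syslog timestamp such as
# "Jan 29 14:53:38" at the start of each line.
find_duplicate_logins() {
    awk '
        /sshd\[[0-9]+\]: Accepted / {
            # Split "... sshd[699]: Accepted ..." into PID and message.
            rest = substr($0, index($0, "sshd[") + 5)
            pid  = substr(rest, 1, index(rest, "]") - 1)
            msg  = substr(rest, index(rest, ": ") + 2)
            key  = substr($0, 1, 15) " " msg
            # Count each PID only once per timestamp and message.
            if (!((key, pid) in seen)) {
                seen[key, pid] = 1
                count[key]++
                pids[key] = pids[key] (pids[key] == "" ? "" : ",") pid
            }
        }
        END {
            for (key in count)
                if (count[key] > 1)
                    print pids[key] ": " key
        }
    '
}
```

It would be run as "find_duplicate_logins < /var/log/auth.log". Each output line lists the PIDs followed by the shared timestamp and message; the order of output lines is not guaranteed.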
I've taken a diff between RELENG_5_3 and RELENG_5 but didn't find any obvious
changes that could have led to this unusual situation. I guess the only
somewhat related change could be the addition of the "logpriv" mechanism for
protection against the consequences of syslogd flooding.
To convince myself that all of this is specific to the RELENG_5_3 -> RELENG_5
upgrade, I've just reverted one of the systems back to RELENG_5_3 and all of
the above mentioned problems have disappeared. All of the upgrades and
downgrades have been accompanied by mergemaster.
Some additional info about the "mail" system above:
mail# uname -rs
FreeBSD 5.3-STABLE
mail# grep ssh /etc/rc.conf
sshd_enable="YES"
mail# grep syslog /etc/rc.conf
syslogd_flags="-4 -s -b 192.168.0.7"
mail# grep root /etc/master.passwd | head -1
root:*:0:0::0:0:Andrew Konstantinov:/root:/bin/csh
mail# grep -EA 3 '^root:\\' /etc/login.conf
root:\
:ignorenologin:\
:tc=default:
mail#
Am I missing something obvious here? Any pointers on debugging this? Please let
me know if additional info is needed.
Thanks,
Andrew
Silence... Does that mean that reading auth.log is out of fashion now or that
nobody has seen anything similar in RELENG_5 systems? Could someone either
confirm or disprove the existence of those two bugs before I file a PR?
Thanks in advance,
Andrew
On Sun, 30 Jan 2005, Andrew Konstantinov wrote:
> Hello,
>
> As the topic says, I've experienced some unusual sshd behavior after I moved
> some of my systems from RELENG_5_3 to RELENG_5 recently. The behavior is
> illustrated by the following excerpt from /var/log/auth.log on the RELENG_5
> system:
>
> Jan 29 14:53:38 mail sshd[699]: login_getclass: unknown class 'root'
I can't reproduce this on my systems, many of which started at 5.3 and now
build 5-stable. Are you using the system ssh or one you built from ports?
What is the output of 'ls -l /etc/login.conf*'?
--
Doug White | FreeBSD: The Power to Serve
dwh...@gumbysoft.com | www.FreeBSD.org
_______________________________________________
freebsd...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stabl...@freebsd.org"
mail# uname -rs
FreeBSD 5.3-STABLE
mail# date
Mon Jan 31 16:53:00 PST 2005
mail# ls -l /etc/login.conf*
-rw-r--r-- 1 root wheel 6522 Jan 29 14:09 /etc/login.conf
-rw-r--r-- 1 root wheel 65536 Jan 29 14:09 /etc/login.conf.db
mail# grep -A 3 -E '^root' /etc/login.conf
root:\
:ignorenologin:\
:tc=default:
mail# tail -4 /var/log/auth.log
Jan 31 16:52:59 mail sshd[14262]: login_getclass: unknown class 'root'
Jan 31 16:52:59 mail last message repeated 3 times
Jan 31 16:52:59 mail sshd[14262]: Accepted publickey for root from 192.168.0.1 port 59976 ssh2
Jan 31 16:52:59 mail sshd[14261]: Accepted publickey for root from 192.168.0.1 port 59976 ssh2
mail#
I'm using the system supplied ssh client and server. All of this is really
confusing to me. Three of my systems were initially running 5.2.1, then were
upgraded to 5.3 release and then followed the vector of p1, p2, p3, p4, and p5
updates. But, a few days ago I moved all of them to RELENG_5 and this weirdness
came up. The most interesting part is that when I downgrade back to RELENG_5_3,
all of this disappears.
Here is what happens to sshd in debug mode:
mail# sshd -ddd
debug2: read_server_config: filename /etc/ssh/sshd_config
debug1: sshd version OpenSSH_3.8.1p1 FreeBSD-20040419
[...]
debug3: mm_request_receive entering
debug3: monitor_read: checking request 22
debug1: ssh_dss_verify: signature correct
debug3: mm_answer_keyverify: key 0x80789b0 signature verified
debug3: mm_request_send entering: type 23
Accepted publickey for root from 192.168.0.1 port 63791 ssh2
debug1: monitor_child_preauth: root has been authenticated by privileged process
debug3: mm_get_keystate: Waiting for new keys
debug3: mm_request_receive_expect entering: type 24
debug3: mm_request_receive entering
debug2: userauth_pubkey: authenticated 1 pkalg ssh-dss
Accepted publickey for root from 192.168.0.1 port 63791 ssh2
debug3: mm_send_keystate: Sending new keys: 0x8079500 0x80794c0
debug3: mm_newkeys_to_blob: converting 0x8079500
debug3: mm_newkeys_to_blob: converting 0x80794c0
debug3: mm_send_keystate: New keys have been sent
debug3: mm_send_keystate: Sending compression state
debug3: mm_request_send entering: type 24
debug3: mm_send_keystate: Finished sending state
[...]
Here is my make.conf on this particular system:
mail# grep -v '^#' /etc/make.conf
CFLAGS= -O -pipe
COPTFLAGS= -O -pipe
CPUTYPE= p2
KERNCONF= CUSTOM
MAKE_IDEA= YES
NOATM= true
NOGAMES= true
NO_BLUETOOTH= true
NO_FORTRAN= true
NO_I4B= true
NO_PF= true
NO_AUTHPF= true
NO_IPFILTER= true
NO_KERBEROS= true
NO_LPR= true
NO_NIS= true
NO_SENDMAIL= true
PPP_NOSUID= true
PRINTERDEVICE= ascii
WITH_OPTIMIZED_CFLAGS= true
X_WINDOW_SYSTEM=xorg
PERL_VER=5.8.5
PERL_VERSION=5.8.5
PERL_ARCH=mach
NOPERL=yo
NO_PERL=yo
NO_PERL_WRAPPER=yo
mail#
In case it matters, root accounts on those servers do not use passwords for
authentication. The authentication is done solely by public/private ssh keys.
mail# grep root /etc/master.passwd | head -1
root:*:0:0::0:0:Andrew Konstantinov:/root:/bin/csh
mail# mount | head -1
/dev/ad0s1a on / (ufs, local, read-only)
mail# sysctl kern.securelevel
kern.securelevel: 2
mail#
I suppose the kernel config file should not be necessary. :) Any ideas at all?
Thanks in advance,
Andrew
I knew I wasn't hallucinating. When I rebuild and reinstall src/lib/libc from
RELENG_5_3 sources on RELENG_5 system, all of the above problems disappear
altogether. The bugs are in the dynamically linked library that sshd relies on.
Once the new library is in place and "/etc/rc.d/sshd restart" is performed, the
bugs disappear. I don't have time to dig into that right now, but I'll be back
with patches.
Andrew
P.S. And nobody believed me, you people! :)
> > > I can't reproduce this on my systems, many of which started at 5.3 and now
> > > build 5-stable. Are you using the system ssh or one you built from ports?
> > >
> > > What is the output of 'ls -l /etc/login.conf*'?
>
> I knew I wasn't hallucinating. When I rebuild and reinstall src/lib/libc
> from RELENG_5_3 sources on RELENG_5 system, all of the above problems
> disappear altogether. The bugs are in the dynamically linked library
> that sshd relies on. Once the new library is in place and
> "/etc/rc.d/sshd restart" is performed, the bugs disappear. I don't have
> time to dig into that right now, but I'll be back with patches.
The simple fact stands that no one else can reproduce this, which leads me
to believe you took a non-standard approach to upgrading, and therefore
are getting what you asked for. :-)
If you can provide exact reproduction steps, starting from bare metal,
I'll follow them.
No algorithm for reproduction yet, but here is some additional information
regarding this issue:
First of all, I just rebuilt everything in the system twice, following the
proper sequence each time. Here are the steps I've taken:
- cvsup /usr/src with RELENG_5
- cd /usr/src && make buildworld buildkernel installkernel
- reboot into single user mode
- mount all
- cd /usr/src && make installworld
- mergemaster
- find /bin /sbin /lib /libexec /usr/bin /usr/sbin /usr/lib /usr/libexec \
/usr/libdata /usr/include -ctime +1d -exec rm -rf {} \;
- reboot
- rm -rf /usr/include/*
- cd /usr/src && make includes
- cd /usr/src && make buildworld buildkernel installkernel
- reboot into single user mode
- mount all
- cd /usr/src && make installworld
- mergemaster
- find /bin /sbin /lib /libexec /usr/bin /usr/sbin /usr/lib /usr/libexec \
/usr/libdata /usr/include -ctime +1d -exec rm -rf {} \;
- reboot
That sequence of steps should guarantee that none of the old libraries or old
includes in the /usr/include find their way into the upgraded system. Sadly,
this didn't change anything.
The other important thing that I've noticed is that when I set
UsePrivilegeSeparation in sshd_config to "no", all those bugs disappear.
I'll try to come up with a recipe for reproduction once I have enough time.
Andrew
Also, when I traced sshd in debug mode using gdb, I found that
/usr/src/lib/libc/gen/getcap.c lines 246 - 274 work properly and return the
valid "root" entry from the login database, and that this code is enclosed in
the else branch of the "if (fd >= 0)" construction. So, apparently, something
reaches getent() without going through cgetent(), with an already existing file
descriptor, which causes a different portion of code to be executed (instead of
lines 246 - 274), which in turn causes the problem. Perhaps the descriptor is
pointing to the wrong file?
Andrew
First of all, I apologize for the incorrect diagnosis. The real bug is not in
the upgrade from RELENG_5_3 to RELENG_5. Secure shell's odd behavior is caused
by "NO_NIS=true" in /etc/make.conf. The reason the bug disappeared when I
reverted src/lib/libc from RELENG_5 back to RELENG_5_3 is this change:
warrior# cvs rdiff -u -rRELENG_5_3 -rRELENG_5 src/lib/libc/Makefile
Index: src/lib/libc/Makefile
diff -u src/lib/libc/Makefile:1.52 src/lib/libc/Makefile:1.52.2.1
--- src/lib/libc/Makefile:1.52 Fri May 14 12:04:29 2004
+++ src/lib/libc/Makefile Sun Nov 28 14:10:16 2004
@@ -1,5 +1,5 @@
# @(#)Makefile 8.2 (Berkeley) 2/3/94
-# $FreeBSD: src/lib/libc/Makefile,v 1.52 2004/05/14 12:04:29 cognet Exp $
+# $FreeBSD: src/lib/libc/Makefile,v 1.52.2.1 2004/11/28 14:10:16 bz Exp $
#
# All library objects contain FreeBSD revision strings by default; they may be
# excluded as a space-saving measure. To produce a library that does
@@ -60,7 +60,7 @@
.if ${MACHINE_ARCH} == "arm"
.include "${.CURDIR}/softfloat/Makefile.inc"
.endif
-.if !defined(NO_YP_LIBC)
+.if !defined(NO_NIS)
CFLAGS+= -DYP
.include "${.CURDIR}/yp/Makefile.inc"
.endif
When I reverted my system back to RELENG_5_3, that effectively disabled the
"NO_NIS=true" flag in /etc/make.conf, because the RELENG_5_3 libc Makefile
tests NO_YP_LIBC rather than NO_NIS. So, the good news is that I get to have
clean logs after removing "NO_NIS=true" from /etc/make.conf.
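Since the Makefile in the diff tests defined(NO_NIS), any assignment to NO_NIS in make.conf disables YP support in libc, whatever its value. A quick way to check a given make.conf for that is sketched below, assuming POSIX sh and grep. The name has_no_nis is a hypothetical helper, and it only looks for an uncommented assignment; it does not evaluate make(1) conditionals or included files.

```shell
#!/bin/sh
# has_no_nis FILE: succeed (exit 0) if FILE, in make.conf syntax, contains an
# uncommented assignment to NO_NIS. The value is ignored because the libc
# Makefile only tests whether the variable is defined.
# Hypothetical helper, shown only as an illustration.
has_no_nis() {
    grep -Eq '^[[:space:]]*NO_NIS[[:space:]]*[?+:!]?=' "$1"
}
```

It would be used as: if has_no_nis /etc/make.conf; then echo "libc is built without YP"; fi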
*Possible* exact reproduction steps:
- install RELENG_5
- rebuild RELENG_5 with "NO_NIS=true" in /etc/make.conf
- restart sshd service
The reason why they are "possible" is because I'm not sure if that is the only
condition that has to be present in the system in order for the bug to appear.
Can anyone confirm this?
Andrew
> *Possible* exact reproduction steps:
> - install RELENG_5
> - rebuild RELENG_5 with "NO_NIS=true" in /etc/make.conf
> - restart sshd service
Sorry, no dice. I had to set "PermitRootLogin yes" in
/etc/ssh/sshd_config but logging in as root with password succeeds with no
login class warning. Upgraded from a RELENG_5 from yesterday to one about
90 minutes old.
What are the contents of /etc/nsswitch.conf? bz is telling me that if you
still have 'nis' in the lines in nsswitch and you compile with NO_NIS,
you'll get weird user lookup errors.
Also what are the contents of /etc/make.conf?
#--- The nsswitch.conf:
group: compat
group_compat: nis
hosts: files dns
networks: files
passwd: compat
passwd_compat: nis
shells: files
#----------------------
Hmm, I completely forgot about that one. :( I guess 'nis' should have been
switched to 'files' whenever the system is compiled with "NO_NIS=true".
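To see which nsswitch.conf databases still name 'nis' as a source, the file can be scanned one line at a time. This is a minimal sketch, assuming POSIX sh and awk and the usual "database: source ..." layout with whitespace after the colon. The name nis_databases is a hypothetical helper, not a FreeBSD tool.

```shell
#!/bin/sh
# nis_databases FILE: print the name of each nsswitch.conf database whose
# source list contains the word "nis". Text after "#" is ignored.
# Hypothetical helper, shown only as an illustration.
nis_databases() {
    awk '
        { sub(/#.*/, "") }                  # strip comments, re-splits fields
        /^[ \t]*[A-Za-z_]+:/ {
            db = $1
            sub(/:$/, "", db)               # "passwd_compat:" -> "passwd_compat"
            for (i = 2; i <= NF; i++)
                if ($i == "nis") { print db; break }
        }
    ' "$1"
}
```

Run as "nis_databases /etc/nsswitch.conf" against the file shown above, it would list group_compat and passwd_compat.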
#--- current make.conf:
CFLAGS= -O -pipe
COPTFLAGS= -O -pipe
CPUTYPE= athlon-xp
KERNCONF= CUSTOM
MAKE_IDEA= YES
NOATM= true
NOGAMES= true
NO_BLUETOOTH= true
NO_FORTRAN= true
NO_I4B= true
NO_PF= true
NO_AUTHPF= true
NO_IPFILTER= true
NO_KERBEROS= true
NO_LPR= true
NO_SENDMAIL= true
PPP_NOSUID= true
PRINTERDEVICE= ascii
WITH_OPTIMIZED_CFLAGS= true
X_WINDOW_SYSTEM=xorg
PERL_ARCH=mach
NOPERL=yo
NO_PERL=yo
NO_PERL_WRAPPER=yo
PERL_VER=5.6.1
PERL_VERSION=5.6.1
#----------------------
Andrew
it's not documented - sorry, will do that.
change it to sth like:
group: files
hosts: files dns
networks: files
passwd: files
shells: files
w/o this change I can see sth like this when doing passwd auth:
'sshd[1995]: NSSWITCH(nss_method_lookup): nis, passwd_compat, endpwent, not found'
But I suspect this will not help with your problem.
Did you change your login.conf?
Could you mail me (private mail please) the library with which you can
see the problems?
--
Bjoern A. Zeeb bzeeb at Zabbadoz dot NeT
Actually, that solves all the problems. Once I switched to your version of
nsswitch.conf, all the "unknown class" bugs and multiple logging events have
disappeared.
> Did you change your login.conf?
I always used the one that FreeBSD supplies, without any modifications. I even
copied it from /usr/src/ multiple times and rebuilt the database from it to
ensure that it's not some sort of filesystem glitch.
> Could you mail me (private mail please) the library with which you can
> see the problems?
libc.so.5 with debug symbols is on its way to bz@
As a sidenote: I definitely agree that it should be documented. Also, it's my
personal opinion, but perhaps it's better to switch the default nsswitch.conf
file to one that doesn't contain "nis" as a lookup mechanism. It's much easier
to add a couple of lines to the "NIS/YP" section of the handbook that tell the
reader to modify /etc/nsswitch.conf to accommodate "NIS/YP" than to document
(I can't think of any appropriate section) that whenever a system is built with
"NO_NIS=true" in make.conf, the user should modify /etc/nsswitch.conf to
accommodate the change. I realize that it's entirely my fault for not
anticipating the impact of "NO_NIS=true", but still, I consider the approach
described above better.
Andrew
> > > > What are the contents of /etc/nsswitch.conf? bz is telling me that if you
> > > > still have 'nis' in the lines in nsswitch and you compile with NO_NIS,
> > > > you'll get weird user lookup errors.
> > >
> > > Hmm, I completely forgot about that one. :( I guess 'nis' should have been
> > > switched to 'files' whenever the system is compiled with "NO_NIS=true".
> >
> > it's not documented - sorry, will do that.
> >
> > change it to sth like:
> >
> > group: files
> > hosts: files dns
> > networks: files
> > passwd: files
> > shells: files
> >
> > w/o this change I can see sth like this when doing passwd auth:
> >
> > 'sshd[1995]: NSSWITCH(nss_method_lookup): nis, passwd_compat, endpwent, not found'
> >
> > But I suspect this will not help with your problem.
>
> Actually, that solves all the problems. Once I switched to your version of
> nsswitch.conf, all the "unknown class" bugs and multiple logging events have
> disappeared.
thanks for the information (and thanks for the lib). I will check and
see what can be done to prevent these problems in the future.