PID USER PRI NI SIZE RSS SHARE STAT LIB %CPU %MEM TIME COMMAND
6992 guest-sh 13 0 2624 2624 2348 S 0 0.4 4.1 0:00 smbd
6706 root 0 0 1312 1312 788 S 0 0.0 2.0 0:01 smbd
2.0.10 on Solaris, on the other hand, shows each smbd's RSS (resident
set size) growing as soon as it is started, and growing further
during use. We're primarily concerned with the large initial
growth in RSS as each additional daemon is started, as we may have
quite a large number of clients per server...
root[csh]@huey[22]# uname -a
SunOS huey 5.8 Generic_108528-03 sun4u sparc SUNW,Ultra-60
On first starting the daemons:
PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
634 root 1 0 0 3112K 1552K sleep 0:00 0.00% smbd
When one connection is made:
PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
667 root 1 58 0 4584K 3536K sleep 0:00 0.23% smbd
634 root 1 40 0 3112K 1576K sleep 0:00 0.00% smbd
After 2 connections:
PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
671 root 1 58 0 4256K 1984K sleep 0:00 0.03% smbd
667 root 1 58 0 4584K 3640K sleep 0:00 0.02% smbd
634 root 1 47 0 3112K 1576K sleep 0:00 0.00% smbd
After 3 connections:
PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
667 root 1 58 0 4584K 3672K sleep 0:00 0.00% smbd
671 root 1 58 0 4584K 2656K sleep 0:00 0.00% smbd
681 root 1 58 0 4584K 2576K sleep 0:00 0.12% smbd
634 root 1 48 0 3112K 1576K sleep 0:00 0.03% smbd
This is a risk issue for us, because we sized the systems based
on measurements done with 2.0.7, then applied the bugfixes by rolling
forward to 2.0.10.
It looks rather as if the fix in 2.0.9/10 has a memory leak:
I've looked at 2.0.10, but I don't see why there should be a
problem... could the authors of the security fixes help us out
with this, please?
--dave
--
David Collier-Brown, | Always do right. This will gratify
Performance & Engineering Team | some people and astonish the rest.
Americas Customer Engineering | -- Mark Twain
(905) 415-2849 | dav...@canada.sun.com
I also have another problem with the Windows 2000 backup utility and samba.
It seems when backing up 2GB+ of data to a file on a samba share with verify
on, the job will never stop and the "Remaining time" increases. Thanks.
BOOL reset_stat_cache( void )
{
    static BOOL initialised;
    if (!lp_stat_cache()) return True;
    if (!initialised) {
        initialised = True;
        return hash_table_init( &stat_cache, INIT_STAT_CACHE_SIZE,
                                (compare_function)(strcmp));
    }
    hash_clear(&stat_cache);
    return hash_table_init( &stat_cache, INIT_STAT_CACHE_SIZE,
                            (compare_function)(strcmp));
} /* reset_stat_cache */
----- Original Message -----
From: "David Collier-Brown" <dav...@canada.sun.com>
To: "Gerald Carter" <gca...@valinux.com>
Cc: <David.Col...@sun.com>; "Jeremy Allison" <jer...@varesearch.com>;
<to...@aus.sun.com>; <cr...@aus.sun.com>; <all...@sun.com>;
<samba-t...@samba.org>
Sent: Monday, August 20, 2001 1:41 PM
Subject: Re: Time-critical problem at Sun: exploding smbd memory usage
> Gerald Carter wrote:
> > Strange. There were no major changes between 2.0.7 and 2.0.10. Only
> > security fixes. Does the memory usage flatline out after a while?
> > I normally see 4.5Mb for the RSS for running smbd on Solaris 2.6 - 8.
>
> Yes, it flattens out, with a claimed size of 17MB, and
> RSS of 15 (if prstat is to be believed!).
Four main shares: homes, printers, /net and /usr/dist
The globals are:
workgroup = AUS
netbios name = homer
server string = Samba %v, ITsamba v1.1 on %L
encrypt passwords = no
password level = 4
log level = 1
time server = Yes
os level = 200
max log size = 8000
preferred master = no
domain master = no
wins support = no
wins server = famine.aus
wins proxy = Yes
guest account = samba
local master = no
name resolve order = host wins bcast
dns proxy = yes
preserve case = yes
short preserve case = yes
default case = lower
printcap name = lpstat
printing = sysv
load printers = yes
security = user
socket options = TCP_NODELAY
lpq cache time = 0
map to guest = Bad User
interfaces = 127.0.0.1 hme0 hme1
bind interfaces only = yes
The process map looks like this:
# pmap 3886, where 3886 is the smbd serving my smbclient
3886: /opt/samba/sbin/smbd -D -l /var/log/samba/2001-08-17.log.smb
00010000 808K read/exec /opt/samba/sbin/smbd
the program's executable code
000E8000 240K read/write/exec /opt/samba/sbin/smbd
the program's globals
00124000 13360K read/write/exec [ heap ]
this, as you might expect, is where the space
is used up.
FF000000 1024K read/write/exec/shared [ shmid=0x771 ]
a shared memory area
FF120000 24K read/exec /usr/lib/nss_files.so.1
FF136000 8K read/write/exec /usr/lib/nss_files.so.1
and a bunch of shared library code and global data...
FF140000 24K read/exec /usr/lib/nss_nis.so.1
FF156000 8K read/write/exec /usr/lib/nss_nis.so.1
FF160000 16K read/exec /usr/lib/nss_compat.so.1
FF174000 8K read/write/exec /usr/lib/nss_compat.so.1
FF180000 672K read/exec /usr/lib/libc.so.1
FF238000 24K read/write/exec /usr/lib/libc.so.1
FF23E000 8K read/write/exec /usr/lib/libc.so.1
FF250000 16K read/exec /usr/platform/sun4u/lib/libc_psr.so.1
FF260000 16K read/exec /usr/lib/libmp.so.2
FF274000 8K read/write/exec /usr/lib/libmp.so.2
FF280000 552K read/exec /usr/lib/libnsl.so.1
FF31A000 32K read/write/exec /usr/lib/libnsl.so.1
FF322000 32K read/write/exec /usr/lib/libnsl.so.1
FF330000 8K read/write/exec [ anon ]
FF350000 40K read/exec /usr/lib/libsocket.so.1
FF36A000 8K read/write/exec /usr/lib/libsocket.so.1
FF370000 8K read/exec /usr/lib/libsec.so.1
FF382000 8K read/write/exec /usr/lib/libsec.so.1
FF390000 8K read/exec /usr/lib/libdl.so.1
FF3A0000 8K read/write/exec [ anon ]
FF3B0000 136K read/exec /usr/lib/ld.so.1
FF3E2000 8K read/write/exec /usr/lib/ld.so.1
FFBEA000 24K read/write/exec [ stack ]
total 17136K
The parent smbd is almost the same, but without the
/usr/lib/nss_nis.so.1 and /usr/lib/nss_compat.so.1 libraries
and with no shared memory area.
We have enough swap and real memory that a workaround is
possible, but the growth is disquieting (;-))
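For anyone wanting to put a number on the heap's share of that map, here is a minimal C sketch. It assumes the pmap layout shown above (hex address, then a size with a K suffix); the function names pmap_line_kbytes and percent_of_total are made up for this example and are not part of Samba or Solaris.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: return the size in KB from one pmap line such as
 * "00124000 13360K read/write/exec [ heap ]", or -1 if the line does not
 * start with "<hex address> <size>K". Assumes the layout shown above. */
long pmap_line_kbytes(const char *line)
{
    unsigned long addr;
    long kbytes;
    char unit;

    assert(line != NULL);
    if (sscanf(line, "%lx %ld%c", &addr, &kbytes, &unit) != 3 || unit != 'K')
        return -1;
    return kbytes;
}

/* Integer percentage (rounded down) of part within total. */
long percent_of_total(long part_kbytes, long total_kbytes)
{
    assert(total_kbytes > 0);
    return part_kbytes * 100 / total_kbytes;
}
```

Fed the heap line and the total from the map above, this gives 13360K out of 17136K, so roughly 78% of the process size sits in the heap. Annotation lines and the "total" line are rejected with -1 rather than being counted.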
The smbd processes start up at 6 MB each, then grow until killed by
process limits (currently 20 MB). Max observed growth is 115 MB... within
an hour.
The growth was much slower under 2.0.7, but happens quickly under 2.0.10+
and 2.2.
It's not too hard to test: create an smb.conf file with 2000 static shares
(squirt it out with a script & reuse the same directory path). Then watch
the memory growth. Someone with familiarity with the code, access to a
memory leak finder, and a good debugging environment should take a look at
this (i.e. not me :).
-kaf
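The test recipe above could be scripted along these lines. This is only a sketch in C: the function name write_static_shares and the share names are invented for illustration, and the output still needs a [global] section in front of it before it is a usable smb.conf.

```c
#include <assert.h>
#include <stdio.h>

/* Write `count` static share sections to `out`, all pointing at the same
 * directory, as the test recipe suggests. The share names
 * ("share0000", "share0001", ...) are made up for this sketch.
 * Returns 0 on success, -1 on a write error. */
int write_static_shares(FILE *out, int count, const char *path)
{
    int i;

    assert(out != NULL && path != NULL && count >= 0);
    for (i = 0; i < count; i++) {
        if (fprintf(out, "[share%04d]\n\tpath = %s\n\tbrowseable = no\n\n",
                    i, path) < 0)
            return -1;
    }
    return ferror(out) ? -1 : 0;
}
```

Calling it with count = 2000 and a single existing directory gives the 2000-share file described above; the smbd size can then be watched with prstat or top as connections are made.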
> Date: Mon, 20 Aug 2001 12:18:17 PDT
> From: David Collier-Brown <dav...@canada.sun.com>
> Reply-To: David.Col...@sun.com
> To: Gerald Carter <gca...@valinux.com>
> Cc: Kris Desjardins <kris_de...@hotmail.com>,
> David.Col...@sun.com, Jeremy Allison <jer...@valinux.com>,
> to...@aus.sun.com, cr...@aus.sun.com, all...@sun.com,
> samba-t...@samba.org
> Subject: Re: Time-critical problem at Sun: exploding smbd memory usage
>
> Gerald Carter wrote:
> >
> > On Mon, 20 Aug 2001, Kris Desjardins wrote:
> >
> > > We ran samba 2.0.7 on Solaris 7 and had the size reach 28MB per
> > > process (200+ processes) before I had to kill -9 the parent and let
> > > the children eventually timeout and die. I upgraded to 2.0.10 and
> > > applied the patch below from a previous discussion but had no apparent
> > > effect, the processes are 10MB now and growing.
> >
> > Can you track this down to either printing or file sharing?
>
> Hmmn, good thought...
>
> This looks suspicious: we have a whole whack
> of printers! 294 of them????
>
> I remember some printer issues in 2.0, and
> the amount of space they take. I wonder if this
> is what's biting us today...
>
> --dave
>
--
| Keith Farrar | Xerox PARC CSNS | Palo Alto, CA | 650-812-4292 |
| DOMAIN: far...@parc.xerox.com | |
I don't know if this is related, but we once experienced a massive (10x)
increase in smbd (2.0.7) size which we eventually tracked down to a problem in
the smb.conf (a bad include statement) where we were causing the smb.conf to be
parsed over and over at smbd startup. This was under AIX 4.3.3. Eliminating the
re-parsing fixed the ballooning.
jer...@valinux.com on 08/20/2001 12:17:57 PM
To: far...@parc.xerox.com
cc: David.Col...@sun.com, Gerald Carter <gca...@valinux.com>, Kris
Desjardins <kris_de...@hotmail.com>, to...@aus.sun.com,
cr...@aus.sun.com, all...@sun.com, samba-t...@samba.org (bcc: Michael
E Osborne/JACADS/REC)
Subject: Re: Time-critical problem at Sun: exploding smbd memory usage
Keith Farrar wrote:
>
> It's not just the number of printers, it's the total number of shares. We
> have no printers defined, but lots of disk shares (roughly 900 on one box
> and 1500 on a second host). The servers are Sun E450s, but the same type
> of growth pattern occurs on Linux (redhat 7.1 x86).
>
> The smbd processes start up at 6 MB each, then grow until killed by
> process limits (currently 20 MB). Max observed growth is 115 MB... within
> an hour.
>
> The growth was much slower under 2.0.7, but happens quickly under 2.0.10+
> and 2.2 .
>
> It's not too hard to test: create an smb.conf file with 2000 static shares
> (squirt it out with a script & reuse the same directory path). Then watch
> the memory growth. Someone with familiarity with the code, access to a
> memory leak finder, and a good debugging environment should take a look at
> this (i.e. not me :).
Can you trigger growth by touching the smb.conf and
then hitting an smbd with a SIGHUP ?
If so, then it's smb.conf parse related.....
Jeremy.
Running 2.2.1a on IRIX appears to behave badly when you touch smb.conf.
Usually it fills up the disks. I think it's not limiting itself to the
max log size limit; so far it has only happened in the middle
of the night, so I'm not 100% sure on this. If you don't touch the file then
it runs fine for weeks at a time.
This is on a PDC. I think the running smbds keep writing to the old
rotated logfile, while smbds created after the 'touch' write to the new file.
This is unconfirmed, but certainly the old file continues to grow.
Kevin
--
| Kevin Wheatley | These are the opinions of nobody |
| Technical Services Manager | and are not shared by my employers |
| Cinesite Digital Studios | |
The standard smb.conf file contains an include
directive for local-smb.conf, which is usually
a file which contains just a comment...
I tried touching the smb.conf file and sending a
HUP to the smbd process, (thanks, Jeremy) while
watching it with prstat -p 26111, and the size
headed up to a total size of 29MB and an RSS of 27
over about 80-odd HUPs.
Exiting and restarting smbclient created an smbd
with a 29/27MB size, so the problem behavior hasn't
changed.
Commenting out the printers and repeating the touch/kill
loop caused no increase at all over about 500 touch/HUPs.
There was no effect with/without the include option.
Looks like a printers issue, which was a known problem
in the 2.0 timeframe. We had not expected this, as the
previous tests with 2.0.7, on both Solaris and Cobalt,
did not show this growth.
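The touch/HUP loop described above can be sketched in C roughly as follows. The function name and parameters are hypothetical; it relies only on the POSIX utime() and kill() calls, and on smbd re-reading its configuration on SIGHUP as suggested earlier in the thread.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <utime.h>

/* Repeat the touch/HUP test `rounds` times: bump the modification time of
 * the config file, send SIGHUP to the given smbd, then pause.
 * Returns the number of rounds completed; stops early on the first error. */
int touch_and_hup(const char *conf_path, pid_t smbd_pid,
                  int rounds, unsigned pause_seconds)
{
    int done;

    assert(conf_path != NULL && smbd_pid > 0 && rounds >= 0);
    for (done = 0; done < rounds; done++) {
        if (utime(conf_path, NULL) != 0) {   /* NULL means "set times to now" */
            perror("utime");
            break;
        }
        if (kill(smbd_pid, SIGHUP) != 0) {
            perror("kill");
            break;
        }
        if (pause_seconds > 0)
            sleep(pause_seconds);            /* give smbd time to reload */
    }
    return done;
}
```

Running it against the pid being watched with prstat -p repeats the experiment; a pause of a second or two between rounds is a guess on my part, not a measured requirement.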
> On Mon, 20 Aug 2001, Kris Desjardins wrote:
>
> > We ran samba 2.0.7 on Solaris 7 and had the size reach 28MB per
> > process (200+ processes) before I had to kill -9 the parent and let
> > the children eventually timeout and die. I upgraded to 2.0.10 and
> > applied the patch below from a previous discussion but had no apparent
> > effect, the processes are 10MB now and growing.
>
> Can you track this down to either printing or file sharing?
>
We have over 400 file shares and no printer shares set up.
# Global parameters
[global]
workgroup = TEST
netbios name = TESTER
netbios aliases = TESTER-46
server string = Tester
interfaces = zrl0 zrl1 hme0 lo0
bind interfaces only = YES
security = domain
encrypt passwords = Yes
password server = *
restrict anonymous = no
debug level = 1
log file = /usr/local/samba/var/log.%m
max log size = 1000
load printers = No
local master = No
dns proxy = No
wins server = xxx.xxx.xxx.xxx
create mask = 0700
hosts allow = xxx.xxx.xxx. xxx.xxx.xx. xxx.xxx.xxx.
strict locking = yes
deadtime = 900
keepalive = 3600
# Access to home directories
[homes]
comment = Home Directories
writeable = Yes
browseable = No
[im]
browseable = yes
guest ok = no
writable = yes
path = /im
comment = xxxx:/im
and many more like the following
[test$]
path = /test/udata1/test
valid users = test
writeable = yes
browseable = no
With Samba 2.0.7 AND ALSO with Samba 2.0.10,
after each touch of smb.conf smbd grows by about 2MB,
but ONLY if we configure printers (70 of them).
We use SuSE Linux 6.4 with Samba 2.0.7
and SuSE Linux 7.2 with Samba 2.0.10
Günter Wagner
g.wa...@mkg-bank.de
> Try the attached patch to fix the printer related leakage on Solaris.
> We also have about 300 printers.
Thanks! I've merged the missing lp_talloc_free() into
lp_add_one_printer() for 2.2 and HEAD.
cheers, jerry
---------------------------------------------------------------------
www.valinux.com VA Linux Systems gcarter_at_valinux.com
www.samba.org SAMBA Team jerry_at_samba.org
www.plainjoe.org jerry_at_plainjoe.org
--"I never saved anything for the swim back." Ethan Hawk in Gattaca--
| post patch:
| PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
| 9097 root 3568K 1752K sleep 21 0 0:00.00 0.0% smbd/1
| I had done some earlier testing and found that before applying the
| patch, the size of the daemon was determined by the number of shares.
Thanks, team: this is much more sane than before!
Ahhh I see... Jeremy took it back out. Nice of him to do so, but I think he's wrong. The "main
loop" doesn't clean things up after each printer is added when we're using a [printers] clause in
smb.conf. That loop occurs inside pcap_printer_fn. Per my testing, this only seems to make a
difference on Solaris. Maybe fragmentation is occurring inside their malloc()/free()?
Rich B
----- Original Message -----
From: "Gerald Carter" <gca...@valinux.com>
To: "Richard Bollinger" <rabol...@home.com>
Cc: <David.Col...@sun.com>; "Michael E Osborne" <mosb...@jacads.com>; <jer...@valinux.com>;
<far...@parc.xerox.com>; "Kris Desjardins" <kris_de...@hotmail.com>; <to...@aus.sun.com>;
<cr...@aus.sun.com>; <all...@sun.com>; <samba-t...@samba.org>
Sent: Tuesday, August 21, 2001 9:56 PM
Subject: Re: Time-critical problem at Sun: exploding smbd memory usage
Odd that it's Solaris-specific: probably Linux
does something Elegant and Simple (:-))
--dave (Solaris Bigot!) c-b
> The talloc delete in the main loop should free
> any memory allocated in the tallocs inside the
> printer allocation. Why does this cause the RSS
> to grow on Solaris ?
>
> insure on Linux does not flag this as an alloc
> bug (and believe me, it would....).
Ok, let's look at this when the conference
is over: it may be a subtle Solaris bug,
and I like to find those.
--dave
I took it out as it's not safe to do that free
inside the printer loop. It's only safe to
do that talloc delete in the main loop, outside
of any incoming smb processing.
The talloc delete in the main loop should free
any memory allocated in the tallocs inside the
printer allocation. Why does this cause the RSS
to grow on Solaris ?
insure on Linux does not flag this as an alloc
bug (and believe me, it would....).
Jeremy.
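To make the two positions in this exchange concrete, here is a toy C sketch of a temporary-memory pool. It is NOT Samba's talloc: every name in it (struct pool, pool_alloc, pool_free_all, add_printers) is invented for illustration. It only shows that when the free is deferred to the end, the number of live temporary allocations peaks at the number of printers, whereas freeing after each printer keeps the peak at one. Whether that peak explains the RSS seen on Solaris is a guess, and freeing inside the loop is only safe if nothing still holds a pointer into the pool, which is the safety concern raised above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a temporary-memory pool; NOT Samba's talloc. */
struct pool_chunk {
    struct pool_chunk *next;
};

struct pool {
    struct pool_chunk *head;   /* chunks still held by the pool */
    size_t live_chunks;
    size_t live_bytes;
};

/* Allocate `size` bytes owned by the pool; returns NULL on failure.
 * The payload is only pointer-aligned, which is enough for strings. */
void *pool_alloc(struct pool *p, size_t size)
{
    struct pool_chunk *c;

    assert(p != NULL);
    c = malloc(sizeof(*c) + size);
    if (c == NULL)
        return NULL;
    c->next = p->head;
    p->head = c;
    p->live_chunks++;
    p->live_bytes += size;
    return c + 1;              /* payload starts after the header */
}

/* Release everything the pool holds; earlier pointers become invalid. */
void pool_free_all(struct pool *p)
{
    assert(p != NULL);
    while (p->head != NULL) {
        struct pool_chunk *next = p->head->next;
        free(p->head);
        p->head = next;
    }
    p->live_chunks = 0;
    p->live_bytes = 0;
}

/* Simulate adding `count` printers, each needing one temporary string.
 * If free_each_time is nonzero the pool is emptied after every printer;
 * otherwise it is emptied once at the end, as a main-loop free would do.
 * Returns the peak number of live chunks seen. */
size_t add_printers(struct pool *p, int count, int free_each_time)
{
    size_t peak = 0;
    int i;

    for (i = 0; i < count; i++) {
        char *tmp = pool_alloc(p, 64);
        if (tmp == NULL)
            break;
        strcpy(tmp, "temporary printer comment");
        if (p->live_chunks > peak)
            peak = p->live_chunks;
        if (free_each_time)
            pool_free_all(p);
    }
    pool_free_all(p);
    return peak;
}
```

With 300 printers the deferred variant holds 300 chunks at its peak and the per-printer variant holds one. Both end with an empty pool, which would be consistent with a leak checker reporting nothing even though the process grew in the meantime.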
Try the attached patch to fix the printer related leakage on Solaris. We also have about 300
printers.
Rich Bollinger, Elliott Company
Attachment: fixleaks.patch
*** ../samba-2.0.7/source.Linux/param/loadparm.c Fri Nov 17 09:24:27 2000
--- ./param/loadparm.c Sat Nov 18 00:46:50 2000
***************
*** 2711,2716 ****
--- 2711,2718 ----
  if ((i=lp_servicenumber(name)) >= 0)
  string_set(&iSERVICE(i).comment,comment);
  }
+ /* free up temporary memory */
+ lp_talloc_free();
  }
  
  /***************************************************************************
*** ../samba-2.0.7/source.Linux/smbd/server.c Thu Mar 16 17:59:52 2000
--- ./smbd/server.c Fri Nov 17 22:58:15 2000
***************
*** 183,188 ****
--- 183,191 ----
  fd_set lfds;
  int num;
  
+ /* free up temporary memory */
+ lp_talloc_free();
+ 
  memcpy((char *)&lfds, (char *)&listen_set,
  sizeof(listen_set));
  
Again, thanks for this. I did not expect the fix to be so quick.
regards
tony
Richard Bollinger wrote:
>
> Try the attached patch to fix the printer related leakage on Solaris. We also have about 300
> printers.
>
> Rich Bollinger, Elliott Company