[AOLSERVER] Getting a handle on memory usage


Titi Alailima

Apr 30, 2008, 9:40:14 AM
to AOLS...@listserv.aol.com

What are the best ways of figuring out how the memory usage in AOLserver is broken down?  I’m not sure I even know what all the main memory consumers are (assuming normal leak-free operation).  Here’s what I know about:

·         nscache – I figure going through all the keys and adding up the sizes of the keys and values would give me a pretty accurate count.  Is there any extra overhead per thread, or is this basically a fixed value no matter how many threads you have?

·         nsv arrays – Not knowing how they are stored, I don’t know how much memory they use.  I could go through all the arrays and add up the sizes of the string representations of the arrays, but I’m guessing it’s stored much more efficiently than that.

·         Threads – global variables, procedures, what have you.  Is there any way to get this usage per thread, and ideally break it down?

·         Anything else?

 

Titi Ala'ilima

Lead Architect

MedTouch LLC

1100 Massachusetts Avenue

Cambridge, MA 02138

617.621.8670 x309

 

-- AOLserver - http://www.aolserver.com/

To Remove yourself from this list, simply send an email to <list...@listserv.aol.com> with the body of "SIGNOFF AOLSERVER" in the email message. You can leave the Subject: field of your email blank.

Dossy Shiobara

Apr 30, 2008, 3:09:59 PM
to AOLS...@listserv.aol.com
On 2008.04.30, Titi Alailima <ti...@MEDTOUCH.COM> wrote:
> What are the best ways of figuring out how the memory usage in
> AOLserver is broken down? [...]

There's overhead all over the place. Each Tcl_Obj structure,
every Tcl_DString buffer ... the AOLserver driver thread and all its
structures used to manage state ...

All these little things should (in theory) be inconsequential compared
to the larger picture of the application's memory usage, but there's a
good chance there are a few lost pointer leaks, etc.

--
Dossy Shiobara | do...@panoptic.com | http://dossy.org/
Panoptic Computer Network | http://panoptic.com/
"He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on." (p. 70)

Rick Cobb

Apr 30, 2008, 5:18:06 PM
to AOLS...@listserv.aol.com
Do we have any test workloads we can use to prove that and isolate the
problem down?

Thanks --
-- ReC

Dossy Shiobara

Apr 30, 2008, 6:23:13 PM
to AOLS...@listserv.aol.com
On 2008.04.30, Rick Cobb <rc...@KNOWNOW.COM> wrote:
> Do we have any test workloads we can use to prove that and isolate the
> problem down?

You have no idea how much I'd love to have someone who's overly anal
retentive--I mean, exhaustively thorough--who has a strong desire to
create such a workload to measure and isolate any memory growths.

Sadly, I'm not that person. :-( I don't care _that_ much. This would
be a great opportunity for someone who would like to help out ...

aT

May 1, 2008, 2:18:01 AM
to AOLS...@listserv.aol.com
Dossy Shiobara wrote:
> On 2008.04.30, Rick Cobb <rc...@KNOWNOW.COM> wrote:
> > Do we have any test workloads we can use to prove that and isolate the
> > problem down?
>
> You have no idea how much I'd love to have someone who's overly anal
> retentive--I mean, exhaustively thorough--who has a strong desire to
> create such a workload to measure and isolate any memory growths.
>
> Sadly, I'm not that person.  :-(  I don't care _that_ much.  This would
> be a great opportunity for someone who would like to help out ...

  
I have always banged my head in vain trying to figure out what exactly causes the growth in the AOLserver process's memory size.
I am not an expert in debugging these issues, and there are many areas that could contribute to the growth of nsd:
the Tcl version, nsoracle, nsmysql.

I keep trying new Tcl versions, and I tried Google perftools, with no success; the process keeps growing.

If somebody needs help identifying the main reasons for this issue once and for all, I am willing to contribute to the effort with my resources and expertise.

We have x86_64 and x86 environments with Oracle 9i and MySQL 4.1.xx; we also have an old ACS 3.4 installation customized to our needs.

Regards




-- 
Syed Atif Ali
D. +971 4 3911914
F. +971 4 3911915
___________________________________________ 
"Resistance is futile. Open your source code and prepare for assimilation."

Titi Alailima

May 1, 2008, 12:30:43 PM
to AOLS...@listserv.aol.com
This isn't a very helpful response. I'm not even asking about ongoing memory growth and overhead, at least not yet. I want to know which parts we (or at least some of us) _do_ understand, and how we can at least calculate those, so that:
1. we even have a sense of how much memory is being eaten up by things we don't know
2. we know at least roughly what sort of impact we can make by adjusting the parts we do know

As for these test workloads, can someone enlighten me more as to what they might look like and how they would be used? Depending on what is involved, I or someone I know may be able to figure something out to get us started in the right direction.




Fenton, Brian

May 1, 2008, 1:29:49 PM
to AOLS...@listserv.aol.com
Hi Titi,

I know Gustaf Neumann had a script to find "application level leaks". If he doesn't respond I can post it.

Brian

Alex

May 1, 2008, 2:07:49 PM
to AOLS...@listserv.aol.com
Please do - I for one would be extremely interested in seeing such a script.

Thanks,
~ Alex.

On Thu, May 1, 2008 at 1:29 PM, Fenton, Brian <Brian....@quest.ie> wrote:
> Hi Titi,
>
> I know Gustaf Neumann had a script to find "application level leaks". If he doesn't respond I can post it.
>
> Brian
>

Maurizio Martignano

May 1, 2008, 2:37:32 PM
to AOLS...@listserv.aol.com
Dear all,
Some time ago I ran some tests on memory leaks in
AOLserver + Tcl + OpenACS.

I'd like to share my two cents worth of what I believe I found, hoping it
might help.

Aolserver uses TCL, and TCL's got three different memory managers:

1. Standard: using the malloc, free, etc... primitives of the OS (not so
fast and not using a lot of memory)
2. Zippy: multithreaded primitives specially developed for TCL (very fast
but using plenty of memory - this is the default mechanism)
3. VTMalloc: another multithreaded implementation, developed by third
parties, and somewhere in between Standard and Zippy in terms of
speed and memory use.

When using Zippy I noticed that the memory occupied by Aolserver (nsd)
always grows, without being released.

By contrast, when using the Standard memory allocator, memory gets released
every now and then when it is not in use. The total memory occupied by
Aolserver keeps growing, but at a slower pace than in the previous case.

How to enable TCL Standard memory allocator?

Just call the following configure command:

configure --enable-threads

Then edit the Makefile, look for AC_FLAGS and remove from it the
-DUSE_THREAD_ALLOC=1 define (i.e. delete the string "-DUSE_THREAD_ALLOC=1").

Compile everything, and see how it goes....

With this change in place, I've seen some saving on the memory occupied by
nsd (around 30%).

Another point to look at very carefully is the overall system configuration
(config.tcl):
Parameters like

maxconnections
maxdropped
maxthreads
minthreads
threadtimeout

and

maxidle
maxopen
connections 1

have quite an effect on the total occupied memory.

Here there's no general recipe. My suggestion would be try to use the
smallest possible values that still allow the system to work.

Hope it helps,
Maurizio

Dossy Shiobara

May 1, 2008, 3:11:16 PM
to AOLS...@listserv.aol.com
On 2008.05.01, Maurizio Martignano <Maurizio....@ACM.ORG> wrote:
> When using Zippy I noticed that the memory occupied by Aolserver (nsd)
> always grows, without being released.

That's probably because we don't invoke madvise() anywhere. Of course,
apparently on MacOS X madvise() is broken, and on Win32 we should use
VirtualFree(). See this entry about jemalloc where this issue is
explored:

Perceived jemalloc memory footprint
http://www.canonware.com/~ttt/2008/01/perceived-jemalloc-memory-footprint.html
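
For readers who have not used it, the madvise() call mentioned above looks roughly like the sketch below. It assumes a Linux-like system with anonymous mmap(); the function name release_idle_pages is hypothetical, and nothing here is taken from the AOLserver or Zippy sources. It only illustrates how an allocator can hand dirty pages back to the kernel while keeping the address range mapped.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* expose MAP_ANONYMOUS and madvise() on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `pages` anonymous pages, dirty them so they
 * count toward the resident set, then tell the kernel the contents are
 * no longer needed. Returns 0 on success, -1 on failure.
 */
int release_idle_pages(size_t pages)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0 && pages > 0);
    size_t len = pages * (size_t)page_size;

    void *region = mmap(NULL, len, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) {
        return -1;
    }

    memset(region, 0xAB, len);  /* touch every page so it becomes resident */

    /* The mapping stays valid, but the kernel may now reclaim the pages. */
    int rc = madvise(region, len, MADV_DONTNEED);

    munmap(region, len);
    return rc == 0 ? 0 : -1;
}
```

An allocator that caches freed blocks per thread would issue a call like this on whole free pages; without it, the process size reported by ps can stay high even though the memory is unused.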

I mentioned on the IRC chat that it might be fun to implement a very
fine-grained debugging memory allocator implementation that keeps lots
of statistics that we can examine and is designed with knowledge that
it's debugging an AOLserver process to record the relevant bits of
information so we can correlate it back to an activity that the code is
performing.

I don't know if this is a worthwhile idea, but it was fun to think about
for a moment, anyway.
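
A minimal sketch of what such a statistics-keeping allocator could look like follows. Everything in it is an assumption made for illustration: the activity tags, the dbg_ names, and the single global table are hypothetical, and the code is not thread-safe as written (a real version inside nsd would need a mutex or per-thread tables).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical activity tags; a real server would choose its own. */
enum dbg_activity { DBG_CONN, DBG_SCHED, DBG_JOB, DBG_ACTIVITY_COUNT };

struct dbg_stats {
    size_t live_bytes;    /* bytes allocated and not yet freed */
    size_t live_blocks;   /* blocks allocated and not yet freed */
    size_t total_allocs;  /* allocations ever made */
};

static struct dbg_stats dbg_table[DBG_ACTIVITY_COUNT];

/* Header stored in front of each block so dbg_free can find the tag. */
union dbg_header {
    struct { size_t size; enum dbg_activity activity; } info;
    max_align_t align;    /* keeps the user pointer suitably aligned */
};

/* Allocate `size` bytes and charge them to `activity`. */
void *dbg_malloc(size_t size, enum dbg_activity activity)
{
    assert(activity < DBG_ACTIVITY_COUNT);
    union dbg_header *h = malloc(sizeof *h + size);
    if (h == NULL) {
        return NULL;
    }
    h->info.size = size;
    h->info.activity = activity;

    struct dbg_stats *s = &dbg_table[activity];
    s->live_bytes += size;
    s->live_blocks++;
    s->total_allocs++;
    return h + 1;         /* user memory starts right after the header */
}

/* Free a block from dbg_malloc and credit the activity it was charged to. */
void dbg_free(void *ptr)
{
    if (ptr == NULL) {
        return;
    }
    union dbg_header *h = (union dbg_header *)ptr - 1;
    struct dbg_stats *s = &dbg_table[h->info.activity];
    s->live_bytes -= h->info.size;
    s->live_blocks--;
    free(h);
}

/* Read-only view of the counters for one activity. */
const struct dbg_stats *dbg_get_stats(enum dbg_activity activity)
{
    assert(activity < DBG_ACTIVITY_COUNT);
    return &dbg_table[activity];
}
```

Dumping the table periodically would show which activity's live_bytes keeps climbing, which is the correlation described above.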

Vlad Seryakov

May 1, 2008, 3:31:22 PM
to AOLS...@listserv.aol.com
Let me add my 2 cents as well, as I have spent a lot of time trying to
deal with an ever-growing nsd process.

Currently I have an nsd process (Naviserver) of about 500 MB that has been
running without a restart for 15 days.

Tcl is compiled without Zippy, with the standard malloc family. I tried it
with vtmalloc as well; the difference is not big. It shrinks faster, but
overall, size-wise, it is similar to just standard Linux malloc.

The only big change I had to make to Naviserver was to introduce a timeout
and a maximum number of jobs to execute for the Tcl job facility and the
scheduling thread. Conn threads are controlled by maxconn etc., and DB
handles have idle and config parameters for when to close connections. The
only missing part was to let sched and tcljob clean up their resources.

For example, I use ns_job and sch_schedXXX a lot, which means all Tcl
interps allocated by sched and job threads stay forever and grow
unconditionally.

But with limits and config parameters that tell those threads when to
exit, it works perfectly. I hope those changes may be useful for the
AOLserver code as well.
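
The exit policy described above can be sketched as a small decision function. This is an illustration only: the struct and function names are hypothetical and are not the Naviserver configuration parameters or code. The point is that a job or scheduler thread which exits after a bounded number of jobs, or after sitting idle, takes its Tcl interp and everything the interp has accumulated with it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limits; a value of 0 disables the corresponding check. */
struct worker_limits {
    unsigned max_jobs;      /* exit after running this many jobs */
    unsigned idle_timeout;  /* exit after this many idle seconds */
};

struct worker_state {
    unsigned jobs_done;     /* jobs run since the thread started */
    unsigned idle_seconds;  /* seconds since the last job finished */
};

/* Record a finished job; finishing work resets the idle clock. */
void worker_job_finished(struct worker_state *st)
{
    assert(st != NULL);
    st->jobs_done++;
    st->idle_seconds = 0;
}

/* Record that the thread waited `seconds` without receiving work. */
void worker_idle_tick(struct worker_state *st, unsigned seconds)
{
    assert(st != NULL);
    st->idle_seconds += seconds;
}

/* True when the thread should exit so its interp can be freed. */
bool worker_should_exit(const struct worker_limits *lim,
                        const struct worker_state *st)
{
    assert(lim != NULL && st != NULL);
    if (lim->max_jobs != 0 && st->jobs_done >= lim->max_jobs) {
        return true;
    }
    if (lim->idle_timeout != 0 && st->idle_seconds >= lim->idle_timeout) {
        return true;
    }
    return false;
}
```

A worker loop would call worker_should_exit after each job and after each timed-out wait, and a replacement thread with a fresh interp would be started on demand.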

Fenton, Brian

May 2, 2008, 4:22:25 AM
to AOLS...@listserv.aol.com
Gustaf originally posted his script here:
http://thread.gmane.org/gmane.comp.web.aolserver/12665/focus=12667

For my OS version, I had to change the ps command to:
set ps [exec ps xv | grep "[pid] "]

regards
Brian


Maurizio Martignano

May 2, 2008, 4:47:55 AM
to AOLS...@listserv.aol.com
Dear all,
If a proper analysis of the memory consumption is of interest to
anyone, it could be an idea to use Valgrind/Memcheck: http://valgrind.org/

Anyhow, before going over such detailed and time consuming analysis, I first
would try to:

1. use the TCL Standard memory allocator (and not the Zippy one)
2. properly configure the parameters in the 'config.tcl'

Cheers,
Maurizio



Rick Cobb

May 2, 2008, 1:46:44 PM
to AOLS...@listserv.aol.com
I hate to just 'me too' on these discussions, but I agree. In both 3.4.2
and 4.5 (the only two AOLServers we have any experience with here), our
typical experience has been that leaks were in our application area, and
many of the most nasty ones to find were in long-lived threads. We've
tried to find them using Purify, valgrind, and a stack-trace we built
into the Windows memory allocator for 3.4.2; of these tools, valgrind
has been the most useful.

Just to add something to the discussion: One point I think is confusing
about the C API, and needs more amplification in the documentation, is
the asymmetry between Ns_TclAllocateInterp and Ns_TclDeAllocateInterp.
In our C/C++ modules, we often need to "get the current interp and run
this Tcl command", which may cause recursion back into our module(s).
And our module can be used from either conn threads or non-conn (timer,
multicast reception, etc) threads. We don't want to leak Interps, but we
also don't want to leak ns_sets and other interp-specific Tcl contexts
that should be cleaned up between requests.

We tried to do this by only allocating the Interp on demand, but then
the question of when to deallocate was very confusing. You don't want to
just Alloc at the beginning of a block, and then DeAlloc at the end, the
way a C++ programmer would usually think. In fact, that usually breaks
things, because the Alloc didn't actually Alloc anything, but the
DeAlloc cleans everything up, so ns_sets and Tcl globals that were
perfectly fine before you called into our module would just "disappear".

Instead, you have to grok the AOLServer driver/conn model, and have your
timer/receiver/whatever thread somehow know whether the demand was made,
and then deallocate only as the complete request context goes out of
scope.

So: I hate the names of those functions. Symmetric names should imply
symmetric operations.

In the end, we constructed our own "KnContext" stack (in C++) we could
use symmetrically. To help ourselves out with performance monitoring,
we instrumented that so we could see leaks of interps, and also count
number of allocated sets (well, last set# allocated) and response time
thresholds as we unwound them.

(Of course, we put all that in our own modules, not the core AOLServer,
so we don't get those benefits automatically when using adps or
registered Tcl procs; there's probably some way to pull that off with
traces, but we were never motivated to try.)
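
The symmetric enter/leave idea described in this message can be sketched in plain C as below. This is not the KnContext code, which was never posted; the ctx_ names are hypothetical, and fake_interp_alloc and fake_interp_cleanup are stand-ins for the real interp calls so that the example is self-contained. The assumption is that the caller can say whether the thread already has a live interp (as a conn thread does), in which case the module must not clean it up.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for real interp allocation and cleanup (both hypothetical). */
static int interps_live = 0;
static int cleanups_run = 0;
static void fake_interp_alloc(void)   { interps_live++; }
static void fake_interp_cleanup(void) { interps_live--; cleanups_run++; }

/* Per-thread context: nesting depth plus whether this module owns the interp. */
struct ctx_state {
    int  depth;
    bool owns_interp;
};
static _Thread_local struct ctx_state ctx;

/*
 * Enter the module. `thread_has_interp` is true when the calling thread
 * already holds an interp that somebody else will clean up.
 */
void ctx_enter(bool thread_has_interp)
{
    if (ctx.depth == 0 && !thread_has_interp) {
        fake_interp_alloc();      /* only the outermost entry allocates */
        ctx.owns_interp = true;
    }
    ctx.depth++;
}

/* Leave the module; only the outermost leave may clean up, and only if we own it. */
void ctx_leave(void)
{
    assert(ctx.depth > 0);
    ctx.depth--;
    if (ctx.depth == 0 && ctx.owns_interp) {
        fake_interp_cleanup();
        ctx.owns_interp = false;
    }
}
```

With this shape, every ctx_enter is paired with exactly one ctx_leave regardless of recursion, and ns_sets or Tcl globals set up by the caller survive calls into the module.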

-- ReC

Stephen Deasey

May 2, 2008, 2:54:50 PM
to AOLS...@listserv.aol.com
On Fri, May 2, 2008 at 6:46 PM, Rick Cobb <rc...@knownow.com> wrote:
>
> Just to add something to the discussion: One point I think is confusing
> about the C API, and needs more amplification in the documentation, is
> the asymmetry between Ns_TclAllocateInterp and Ns_TclDeAllocateInterp.
> In our C/C++ modules, we often need to "get the current interp and run
> this Tcl command", which may cause recursion back into our module(s).
> And our module can be used from either conn threads or non-conn (timer,
> multicast reception, etc) threads. We don't want to leak Interps, but we
> also don't want to leak ns_sets and other interp-specific Tcl contexts
> that should be cleaned up between requests.


Allocate single interp per-server per-thread:

http://sourceforge.net/tracker/index.php?func=detail&aid=1241351&group_id=130646&atid=719006


Here's the history of nsd/tclinit.c for naviserver and aolserver, if
you're looking to compare/port etc. Both have seen lots of changes but
have been kept reasonably well in synch.

http://freehg.org/u/groks/naviserver/log/fad41c08d28b/nsd/tclinit.c
http://freehg.org/u/groks/aolserver/log/9a2383ff0ca5/nsd/tclinit.c


(The above are cleaned up conversions of the cvs repos. It's easier to
browse history)
