I can increase the memory available to the JVM, but that only makes the
problem less frequent - it does not rule it out. It may even be that
some operations can never complete within the memory limits - but it's
hard to know that until you are in the middle of the operation and
actually run out of memory.
The problem I have is that currently I have lots of things going on in
the one server. If one thread uses up too much memory, OutOfMemoryErrors
come flying from all over the place (not just from the thread doing the
expensive operation). I am starting to think that, to make things
robust, I really need multiple JVMs in separate processes so that if
one has a problem, it won't affect other parts of the system.
Are there any good resources or sources of information around talking
about resource management in server processes? For robustness, is it
better to segregate functionality into separate processes and introduce
communication between processes? Or are there ways to control the
resource usage (typically memory usage) of, say, a thread, including when
calling into third-party libraries that you didn't write (and so cannot
control how much memory they use)?
Thanks for any opinions/advice/references!
Alan.
Thanks for the tips, but I know exactly what the problem is and it's not
a memory leak. I was trying to simplify here to keep the post short
(and not put people to sleep with gory details). At the risk of
oversimplifying, let us say we get XML documents holding requests and we
need to convert to a DOM tree. The documents may be large (tens of
megabytes). The amount of memory required for the DOM tree is however
difficult to predict before you start the parse, as a document may have
a very high density of tags, or it may have a low density of tags. There
may be more attributes, or there may be fewer. Because we use a
third-party XML parser (for example), it's hard to predict from the input
XML document how much memory the parse tree is going to take up. In the
real application (which is not just XML) we have observed memory
footprints from 5 times to 100 times the input document size - with no
guarantee it won't go beyond 100 times. One 10 MB document may
therefore need anywhere from 50 MB to 1 GB of memory.
Further, there may be concurrent requests running in parallel. Normally
you would not get several big requests at the same time, but it's
certainly possible. And there may be more than two.
We attempt to throttle the number of concurrent requests using thread
pools. We attempt to predict resource usage and hold back some
operations if we think we might run out of memory - but because it's all
estimates, sometimes we still run out because of estimation errors. We
currently have, say, 5 different sorts of things going on in the same
server (sharing some internal data structures), but the end
result is if one thing goes wrong, everything goes wrong. Ideally I
would like to limit the heap that can be used by a thread so I can give
a thread a fixed space to play in, so if that thread runs out of memory
it does not mean other threads will automatically run out of memory.
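As a rough illustration of the throttling described above, here is a
minimal sketch of admission control driven by estimated memory rather
than by thread count. The class name MemoryBudget and the idea of
counting permits in megabytes are my own assumptions for illustration;
this does not give a thread a private heap (the JVM has no such
facility), it only delays a request until its estimate fits.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Admission control by estimated memory: a sketch, not a hard per-thread heap limit. */
final class MemoryBudget {
    private final int totalMb;
    private final Semaphore permits; // one permit stands for 1 MB of estimated heap

    MemoryBudget(int totalMb) {
        this.totalMb = totalMb;
        // Fair ordering: a large request waiting at the head of the queue
        // holds back later small ones until it has been admitted.
        this.permits = new Semaphore(totalMb, true);
    }

    /** Waits until the estimate fits in the remaining budget, then runs the task. */
    <T> T run(int estimatedMb, Supplier<T> task) {
        // Clamp the claim so an estimate larger than the whole budget
        // runs alone instead of blocking forever.
        int claim = Math.max(1, Math.min(estimatedMb, totalMb));
        permits.acquireUninterruptibly(claim);
        try {
            return task.get();
        } finally {
            permits.release(claim);
        }
    }

    /** Megabytes of budget not currently claimed by running tasks. */
    int availableMb() {
        return permits.availablePermits();
    }
}
```

Each request type would call run() with its own estimate. Because the
claims are only estimates, a bad estimate can still exhaust the real
heap; the sketch narrows the window, it does not close it.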
If I were writing all the code myself in something like C++, I would do
the memory management myself - allocating memory from fixed-size
pools. With Java, that control over memory management does not seem to
be available in the same way. Of course it's made worse by the fact that
I want to use open source code which just uses 'new' as desired, so I
cannot (easily) get it to request memory from a pool even if that were
supported by Java.
So it's not memory leaks I am concerned with - it is good resource
management practices in a JVM for highly threaded server applications
that I am after.
Thanks again!
Alan
Alexey
/lift/ committer (www.liftweb.net)
SGS member (Scala Group Sweden)
SEJUG member (Swedish Java User Group)
\_____________________________________/
Sorry - I regret mentioning DOM. I am not using DOM. I just used it as
an example to make the point that I have a situation where it is hard to
predict the memory usage of code given its inputs.
The question is how best to manage a process servicing lots of
requests (and different types of requests) where it's hard to predict the
memory usage of the requests. Is there anything better than "add more
swap and cross your fingers"? Are there any good architectural patterns
to follow? We are using things like thread pools to limit levels of
concurrency, but we almost want to dynamically adapt based on current
memory usage - *if* we could work out the memory usage of a request.
For example, pause a thread that seems to be using up lots of memory,
let some of the smaller requests finish, don't start any new requests,
then let the big one through and start allowing smaller requests back in.
At present I don't know how to predict memory usage given inputs. So
maybe the question is: is there a way to (efficiently) measure memory
usage of a request? More precisely, while a thread is running, can
another thread peek in and work out over time the memory usage of
the thread processing a request? Then a scheduling / controlling
thread would have the information it needs to co-ordinate things. Or
similar - just tossing up an example of what I mean.
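On the "can another thread peek in" question, one partial answer on
HotSpot-based JVMs is the com.sun.management.ThreadMXBean extension,
which reports the cumulative number of bytes a thread has allocated.
The sketch below is hedged on two counts: the extension is not part of
the standard java.lang.management API and may be missing on other
JVMs, and it counts allocation, not live memory, so garbage that has
already been collected is included. The class name
ThreadAllocationProbe is made up for this example.

```java
import java.lang.management.ManagementFactory;

/** Reads per-thread allocation counters where the JVM offers them. */
final class ThreadAllocationProbe {
    // HotSpot-specific extension of the standard ThreadMXBean; null if unavailable.
    private static final com.sun.management.ThreadMXBean BEAN = lookup();

    private static com.sun.management.ThreadMXBean lookup() {
        java.lang.management.ThreadMXBean std = ManagementFactory.getThreadMXBean();
        if (std instanceof com.sun.management.ThreadMXBean) {
            com.sun.management.ThreadMXBean ext = (com.sun.management.ThreadMXBean) std;
            if (ext.isThreadAllocatedMemorySupported()) {
                if (!ext.isThreadAllocatedMemoryEnabled()) {
                    ext.setThreadAllocatedMemoryEnabled(true);
                }
                return ext;
            }
        }
        return null;
    }

    /** True if this JVM can report per-thread allocation. */
    static boolean supported() {
        return BEAN != null;
    }

    /**
     * Cumulative bytes allocated by the given thread so far, or -1 if the
     * figure is unavailable (unsupported JVM or thread not alive).
     * This is an allocation count, not the thread's retained heap.
     */
    static long allocatedBytes(Thread t) {
        return BEAN == null ? -1L : BEAN.getThreadAllocatedBytes(t.getId());
    }
}
```

A controlling thread could sample allocatedBytes() for each worker at
intervals and treat a fast-growing counter as a hint that a request is
expensive. It remains a hint: a request that allocates a lot of
short-lived garbage looks the same as one building a huge tree.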
Thanks!
Alan
> At present I don't know how to predict memory usage given inputs. So
> maybe the question is: is there a way to (efficiently) measure memory
> usage of a request. More precisely, while a thread is running, can
> another thread peek in and work out over time the memory usage of
> another thread processing a request - then a scheduling / controlling
> thread has information it needs to co-ordinate things. Or similar -
> just tossing up an example of what I mean.
In another response, I gave an example of managing jobs that have the
potential to bring down their enclosing JVMs: an RMI broker process starts
and stops worker JVMs and hands them jobs. Applied to your problem, that
would give you some control over the allowed heap size of each spawned
worker JVM. The state of a particular JVM can be monitored asynchronously
by using the Runtime API to poll its maximum and allocated memory figures.
I'm not sure what you would do if you found out you were approaching the
limit of memory allocation, but you could conceivably run such a monitor
thread. Should a worker run out of memory, just have the broker spawn a new
worker JVM with a higher maximum heap.
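To make the broker idea a little more concrete, here is a minimal sketch
of the two pieces involved: launching a worker JVM with a capped heap via
-Xmx, and polling heap usage with the Runtime API from inside a JVM. The
class name WorkerJvm, the worker main class passed in, and the "double the
heap and retry on any non-zero exit" policy are assumptions made for
illustration; the RMI plumbing between broker and worker is left out.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Launches worker JVMs with a fixed heap cap and reports heap usage. */
final class WorkerJvm {
    /** Builds the command line for a child JVM capped at maxHeapMb. */
    static List<String> command(int maxHeapMb, String mainClass, String... args) {
        List<String> cmd = new ArrayList<>();
        cmd.add(System.getProperty("java.home") + File.separator + "bin" + File.separator + "java");
        cmd.add("-Xmx" + maxHeapMb + "m");            // hard heap limit for this worker only
        cmd.add("-cp");
        cmd.add(System.getProperty("java.class.path"));
        cmd.add(mainClass);                           // hypothetical worker entry point
        for (String a : args) {
            cmd.add(a);
        }
        return cmd;
    }

    /** Starts one worker, waits for it to finish, and returns its exit code. */
    static int runOnce(int maxHeapMb, String mainClass, String... args)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command(maxHeapMb, mainClass, args)).inheritIO().start();
        return p.waitFor();
    }

    /** Retries with a doubled heap after each failed run, up to capMb (assumed policy). */
    static int runEscalating(int startMb, int capMb, String mainClass, String... args)
            throws IOException, InterruptedException {
        int code = -1;
        for (int mb = startMb; mb <= capMb; mb *= 2) {
            code = runOnce(mb, mainClass, args);
            if (code == 0) {
                break; // the job fitted in this heap size
            }
        }
        return code;
    }

    /** Fraction of this JVM's maximum heap currently in use. */
    static double heapUsedFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }
}
```

Note that heapUsedFraction() describes the JVM it runs in, so a monitor
thread would live inside each worker and report back to the broker. The
figure also includes garbage not yet collected, so it tends to overstate
live memory.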
You could also attempt to use some predictive logic for the amount of
memory needed, given the size of your XML data set. As you said, it's not a
linear correlation, but you could still try to come up with rough estimates,
provided you've taken care of OutOfMemoryError situations as described above.
Prediction would then be largely a performance optimization, not a safety net.
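One very simple form such predictive logic could take is sketched below:
remember the worst expansion ratio seen so far and scale new inputs by it.
The class name FootprintEstimator and the worst-case policy are
assumptions for illustration, not something measured on your application;
a starting ratio has to be supplied by whoever uses it.

```java
/** Estimates memory need as input size times the worst expansion ratio seen so far. */
final class FootprintEstimator {
    private double worstRatio;

    /** initialRatio is the starting guess for memory used per byte of input. */
    FootprintEstimator(double initialRatio) {
        this.worstRatio = initialRatio;
    }

    /** Feeds back what a finished request actually needed. */
    synchronized void record(long inputBytes, long observedBytes) {
        if (inputBytes > 0) {
            worstRatio = Math.max(worstRatio, (double) observedBytes / inputBytes);
        }
    }

    /** Conservative estimate for a new input; may still be too low for an unseen case. */
    synchronized long estimate(long inputBytes) {
        return (long) Math.ceil(inputBytes * worstRatio);
    }
}
```

The broker could use estimate() to pick the initial heap size for a
worker and call record() when the job finishes, so a wrong guess costs a
restart with more memory and nothing worse.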
If I'm not mistaken, Alan doesn't have that option, since he can never be sure
whether loading the job into a third-party library he's using will result in an
OutOfMemoryError, which would likely put that or other libraries in a bad state.