I enabled the Performance Monitoring feature in the CF Admin hoping to see
some useful metrics only to realize that the 'NT Performance Monitor' has been
replaced by the 'Reliability and Performance Monitor' which does not show CF in
its list of potential things to monitor. Since I'm running Standard/32bit on
2008/64bit I made sure to enable the Performance Counter DLL Host with no luck.
What I'm getting at with this effort is that I'm trying to track down why my
CF server keeps crashing without any mention of difficulty in the logs. I
initially had trouble with the server running out of threads, but increasing
the available resources killed that. I suspect I've only hidden the problem
but I need metrics, performance or otherwise, to see what's going on. Any
help, directly or by reference, is greatly appreciated.
-Jonathan
I'd ask first: are you able to use CFSTAT (from the coldfusion8/bin
directory)? That will give you the same metrics. If it doesn't work, then
perfmon won't get them, either. Just to be clear, did you restart CF after
enabling it? Don't recall if you should have to, but worth a shot.
Until that's resolved, and as for getting to the bottom of your problems,
there are also the JRun metrics that you can enable, which are mostly the same
(and more). Just google to find more on that.
You could also use tools like FusionReactor or SeeFusion, or still others
(some are free, and the commercial ones have free trials). See my list at:
ColdFusion Monitoring Tools
http://www.cf411.com/#cfmon.
Finally, as for finding answers, don't rely only on the "cf logs" (literally,
those in coldfusion8/logs). Look in the coldfusion8/runtime/logs (or, if you're
using multiserver, look in jrun4/logs). Those often have far more useful info.
You may also see log files in the coldfusion8/runtime/bin, if you're having
crashes of the hotspot compiler. The Windows Event Viewer may also sometimes
have useful clues.
I discuss all of these and more in a presentation I did:
CF911: Tools and Techniques for CF Server Troubleshooting
http://www.carehart.org/presentations/#cf911
Hope that helps, until someone helps with the specific 2k8 challenge you're
having.
javax.servlet.ServletException: ROOT CAUSE:
java.lang.OutOfMemoryError: unable to create new native thread
I am running Sun Java HotSpot Server VM version 6u4, having changed because I
was fighting this issue under jrun and the web recommended updating one's VM.
Since this did not fix it, is it likely another update, or perhaps a hop to
WebSphere, JBoss or the like might improve my situation?
For reference Heap Min and Max are 1380 and I pass the following options:
-server -Xss128k -Dsun.io.useCanonCaches=false -XX:MaxPermSize=256m
-XX:+UseParallelGC -Dcoldfusion.rootDir={application.home}/../
-Dcoldfusion.libPath={application.home}/../lib
> Since this did not fix it, is it likely another update, or perhaps a hop to
> WebSphere, JBoss or the like might improve my situation?
>
Maybe, maybe not. I think you have to identify *why* your system is
running out of threads. If it is some programming fault that is firing
off thread after thread without cleaning up after itself, then no matter
how many computer resources you throw at it, sooner or later this
process is going to eat up all the available threads.
If your application is an intensive application being used by lots of
users and it just needs this many threads, then you may need more
systems to share the work load.
If something is hanging a process so that each new request has to use a
new thread because none of the old ones have been released, you are still
going to run out of them no matter how many you have.
In other words, there is no system that has unlimited resources. So if
you are bumping against the limits, you need to identify why so an
appropriate resolution can be applied. Otherwise it is just guessing.
You have identified what is bringing down the server; now you need to
monitor it to find out why. The tools mentioned before can help with this.
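One cheap way to tell those three cases apart is to take a few JVM thread
dumps some minutes apart and see whether the thread count keeps climbing and
what state the threads are in. Here is a minimal Python sketch that tallies
thread states from a saved dump. It assumes the dump contains the
"java.lang.Thread.State:" lines that a Java 6 HotSpot JVM prints for each
thread; the script name, function name and file argument are just
illustrative, and how you capture the dump (the JDK's jstack, for example) is
up to you.

```python
import re
import sys
from collections import Counter

# Matches the per-thread state line in a HotSpot thread dump, e.g.
#   "   java.lang.Thread.State: WAITING (on object monitor)"
# Assumption: the dump comes from a JVM that prints these lines (Java 6+).
STATE_LINE = re.compile(r"^\s*java\.lang\.Thread\.State:\s+(\w+)", re.MULTILINE)


def count_thread_states(dump_text):
    """Return a Counter mapping thread state name -> number of threads."""
    return Counter(STATE_LINE.findall(dump_text))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python thread_states.py dump.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        counts = count_thread_states(f.read())
    for state, n in counts.most_common():
        print(f"{state:15} {n}")
    print(f"{'TOTAL':15} {sum(counts.values())}")
```

If the total grows from dump to dump while most threads sit in the same
waiting or blocked state, that points toward something hanging or leaking
rather than plain load.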
Also, did any of the other info above help, in terms of getting metrics?
In the meantime, I read somewhere else that reducing the heap size would
cause the GC to be more aggressive. With 32bit CF running on top of 64bit
Windows/Java, would this be effective?
As for "java monitoring", well, don't confuse FusionReactor and SeeFusion with
jvm memory monitoring tools. That's a whole different kettle of fish. These are
indeed Java monitors, but that's not obvious to us as CFers using them. The
point is simply that they make it crystal clear what requests are running, and
help you drill into them to know why they may not be running well.
I should have pointed to another introductory resource on using FR (as a
CFer), at http://www.carehart.org/articles/#2008_6.
As for the "enterprise monitor", do you mean the CF 8 Server monitor? And as
for it being expensive, do you mean that you have to buy Enterprise to get it?
Or by "eating your young" do you maybe mean that it can be expensive in terms
of resources? That's only if you turn on the memory tracking, in my
experience. I'll add that I also have articles at my site that I've done as
extensive introductions to the CF8 monitor.
As for your heap size question, I don't yet run 64 bit so someone else may
need to chime in, but I suspect that if you're running CF in 32-bit mode you may
be limited to the same 2gig memory space (and therefore a heap of about
1.3-1.5gb).
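To make that address-space arithmetic concrete, here's a rough Python sketch
using the numbers posted earlier in the thread (heap max 1380 MB,
-XX:MaxPermSize=256m, -Xss128k). The 2048 MB figure is just the 2gig
assumption above, and the function names are mine, for illustration only. It
ignores everything else that lives in the process (the JVM's own code, DLLs,
JIT-compiled code, native buffers), so treat the result as a loose ceiling and
not a prediction.

```python
def native_headroom_mb(address_space_mb, heap_max_mb, perm_max_mb):
    """Address space left once the Java heap and PermGen are reserved."""
    return address_space_mb - heap_max_mb - perm_max_mb


def thread_ceiling(headroom_mb, stack_kb):
    """Loose upper bound on threads if ALL headroom went to thread stacks."""
    return (headroom_mb * 1024) // stack_kb


# Assumed 2 GB process address space; heap, PermGen and stack size are the
# values quoted earlier in this thread.
headroom = native_headroom_mb(2048, 1380, 256)
ceiling = thread_ceiling(headroom, 128)

print(headroom)  # 412 (MB left outside heap and PermGen)
print(ceiling)   # 3296 (threads, before anything else takes its share)
```

The exact number matters less than the shape of it: "unable to create new
native thread" is about memory outside the heap, so the larger the heap, the
less room there is for thread stacks.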
But you ask if reducing the heap will make GC more aggressive. I think this
sort of conjecture sometimes is fruitless. There are differences in JVMs (so
what people say/know/read regarding 1.4 may not apply to 1.6), and what GC
algorithm is used (and what the default is in different JVM versions), and so
on. Really, you need to test things. You can observe GC processing by enabling
GC logging.
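Once you have a GC log (HotSpot's -verbose:gc turns the output on, and
-Xloggc:<file> sends it to a file), a few lines of Python can summarize it so
you're not eyeballing thousands of entries. This is only a sketch: it assumes
the simple one-line format, something like
"[GC 325407K->83000K(776768K), 0.2300771 secs]" and the "[Full GC ...]"
equivalent. The format varies with JVM version, collector and extra logging
flags, so the pattern may need adjusting for your log. The function and script
names are illustrative.

```python
import re
import sys

# Assumed line shape (plain -verbose:gc output):
#   [GC 325407K->83000K(776768K), 0.2300771 secs]
#   [Full GC 267628K->83769K(776768K), 1.8479984 secs]
# A leading timestamp such as "12.345: " is tolerated because we search
# anywhere in the text instead of anchoring at the start of the line.
GC_LINE = re.compile(
    r"\[(Full GC|GC)\s+(\d+)K->(\d+)K\((\d+)K\),\s+([\d.]+)\s+secs\]"
)


def summarize_gc(log_text):
    """Count collections and total pause time for minor and full GCs."""
    summary = {
        "GC": {"count": 0, "pause_secs": 0.0},
        "Full GC": {"count": 0, "pause_secs": 0.0},
    }
    for kind, _before, _after, _total, secs in GC_LINE.findall(log_text):
        summary[kind]["count"] += 1
        summary[kind]["pause_secs"] += float(secs)
    return summary


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python gc_summary.py gc.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        for kind, stats in summarize_gc(f.read()).items():
            print(f"{kind:8} count={stats['count']} "
                  f"pause={stats['pause_secs']:.3f}s")
```

Frequent full GCs with long pauses would be worth chasing; if the log looks
quiet right up to the crash, GC probably isn't your problem.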
Still, I'd argue you may be heading down a wrong rabbit hole. What led you to
jump from the problem you described to GC issues? I know many articles/blogs
point to that as the first thing to investigate/tweak, but I don't agree. My
approach (and recommendation to my clients) always is to get the right
diagnostic info to tell you what's wrong, then go fix that.
In my experience, it's hardly ever code. Instead, it's nearly always
configuration, whether of CF, JRun, the JVM, the DB, the OS, the web server,
the network, etc. And honestly, it's often very simple once you get the right
diagnostics.
So what about my last question, about what the runtime out log said before the
last error? I have various things I might guess it could be, from experience,
but again let's let your system tell us what's wrong.