Hey all, I didn't get as far as I had hoped with the GeoNetwork stuff today, but in hopes of actually getting some results this week I've been poking at it a bit after leaving the office. Here's what I got.
Overview
I designed a few load tests, all based on variations of this terrible bash script I put together. The task was always the same: generate 2000 new metadata records with random IDs, based on a sample CSW transaction document generated from GeoNode's template for metadata documents. The only thing that varied between requests was the ID; the rest of the metadata was identical.
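For anyone who wants to reproduce this without the script, the record-generation step can be sketched roughly as below. This is not the actual script: the make_record name, the @ID@ placeholder in the template, and the use of /dev/urandom for the ID are all assumptions for illustration.

```shell
# Hypothetical helper, not the original script.
# Assumes the CSW transaction template marks the record identifier
# with the literal placeholder @ID@.
make_record() (
    template=$1
    # 8 random bytes rendered as 16 hex characters
    id=$(od -An -N8 -tx1 /dev/urandom | tr -d ' \n')
    # Substitute the placeholder everywhere and print the result
    sed "s/@ID@/$id/g" "$template"
)
```

Something like "make_record template.xml > request.xml" would then produce one request body per call, differing only in the ID.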
I also reset GeoNetwork between runs by running it in Tomcat using the following command to start the server:
rm -rf webapps/geonetwork/ && bin/catalina.sh run
"catalina.sh run" puts Tomcat in the foreground so I can just use ctrl+c to stop it, and "rm -rf webapps/geonetwork/" ensures that the Tomcat autodeployer will set up a fresh GeoNetwork at startup.
Round 1
2000 records, in series. Basically:
login;
for i in `seq 1 2000`;
do
generate_and_upload_metadata
done
I didn't run into any problems with this (there were 2000 records in the database after it finished, Tomcat didn't crash, etc.) so I went on to
Round 2,
adding a bit of parallelism (each subprocess shared the same session cookie, however):
login;
for i in `seq 1 10`;
do
( for j in `seq 1 200`; do generate_and_upload_metadata; done ) &
done
Since I have a multicore processor, I can get a bit of a clue how GeoNetwork is doing for concurrency from just watching my CPU monitor, so I didn't do anything more scientific. It seemed to do a pretty good job of utilizing all the resources available (no peaking at 50% CPU usage etc).
Round 3
One of the theories floating around this mailing list about GeoNetwork's stability issues was that they are caused by too many sessions (typically there are over 1000 active sessions when I look at the server stats for demo.geonode.org). So this time around I still ran several parallel subshells, but I created a new session cookie for each insert (2000 inserts => 2000 active sessions).
for i in `seq 1 10`;
do
( for j in `seq 1 200`; do login; generate_and_upload_metadata; done ) &
done
Again GeoNetwork started out by maxing out all my cores, but as the number of sessions increased the CPU usage tended toward 25%. (I have a quad-core system, so this means GN was only able to use one thread of execution at this point.) This time, Tomcat stopped responding before the test could complete. Before it stopped responding entirely there were also some failed inserts (but I still got error responses back). Score one for the "too many sessions" theory.
Round 4
This time I ran the same script from round 3, but watched Tomcat's manager application and manually cleared sessions periodically (every 500 sessions or so). No problems this time around: overall CPU usage hovered around 75% and there were no failed transactions.
Round 5
For completeness I also tried serializing the "2000 requests, 2000 sessions" version of the script.
for i in `seq 1 2000`;
do
login;
generate_and_upload_metadata
done
This didn't fare much better than the parallel version (round 3), but due to the utter lack of parallelism I wrote this entire email while it was running :)
Followup
So I poked around a bit to see what we can do to minimize the number of concurrent sessions. Surprisingly, logging in doesn't seem to cause a session to be created, and logging out doesn't seem to cause one to be terminated. As far as I can tell, it is the database access (inserting/retrieving metadata records) that actually prompts it, and it seems that once a session is created it won't be destroyed unless it is left idle long enough to time out. Performing a search through the web UI does it, and so does a GetRecords request (through the CSW service endpoint).
Therefore, I think we should work on making sure that all our access to GeoNetwork respects any Set-Cookie headers passed to us from GeoNetwork; if we don't pass a session cookie back to the server, it will create a new session for each and every CSW request we make. I think we should block the GeoNode 1.0.1 RC (as discussed, we won't release without running the candidate on demo.geonode.org for a week beforehand) on implementing these cookies in our use of OWSLib, but I expect to be able to take care of that tomorrow and avoid actually delaying the release further.
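On the load-test side, the equivalent of respecting Set-Cookie can be sketched with curl's cookie jar options. This is a rough illustration, not the OWSLib change itself: the csw_post name, the CSW_URL default, the COOKIE_JAR path, and the CURL override are all assumptions.

```shell
# Sketch only: reuse one session across CSW requests via a cookie jar.
# CSW_URL and COOKIE_JAR are assumed values; adjust for the real deployment.
CSW_URL=${CSW_URL:-http://localhost:8080/geonetwork/srv/en/csw}
COOKIE_JAR=${COOKIE_JAR:-/tmp/geonetwork-cookies.txt}
CURL=${CURL:-curl}   # overridable so the command can be inspected or stubbed

# POST one XML document to the CSW endpoint.
# -c writes any cookies the server sets into the jar;
# -b sends the cookies already in the jar back with the request.
csw_post() {
    "$CURL" -s -b "$COOKIE_JAR" -c "$COOKIE_JAR" \
        -H 'Content-Type: application/xml' \
        --data-binary "@$1" "$CSW_URL"
}
```

Calling "csw_post request.xml" repeatedly should then keep sending the same session cookie rather than triggering a new session per request, assuming the server sets its session cookie on the first response.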
--
David Winslow