GeoNetwork findings

David Winslow

Apr 14, 2011, 10:49:14 PM
to geono...@opengeo.org
Hey all, I didn't get as far as I had hoped with the GeoNetwork stuff today, but in hopes of actually getting some results this week I've been poking at it a bit since leaving the office.  Here's what I got.

Overview
I designed a few load tests, all based on variations of this terrible bash script I put together.  Basically the task was always to generate 2000 new metadata records with random IDs, based on a sample CSW transaction document generated from GeoNode's template for metadata documents.  The only thing that varied between requests was the ID; the rest of the metadata was identical.
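For readers without the script at hand, the "same document, new ID each time" step can be sketched roughly as below. This is a minimal illustration, not the script itself: the template is a heavily trimmed, hypothetical stand-in for GeoNode's real metadata template, and `make_transaction` is a name made up for this sketch.

```python
import uuid
from string import Template

# Hypothetical, heavily trimmed stand-in for GeoNode's metadata template.
# The real template carries a full ISO 19139 record; only the ID varies here.
CSW_INSERT_TEMPLATE = Template("""<?xml version="1.0" encoding="UTF-8"?>
<csw:Transaction xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
                 xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco"
                 service="CSW" version="2.0.2">
  <csw:Insert>
    <gmd:MD_Metadata>
      <gmd:fileIdentifier>
        <gco:CharacterString>$record_id</gco:CharacterString>
      </gmd:fileIdentifier>
    </gmd:MD_Metadata>
  </csw:Insert>
</csw:Transaction>
""")


def make_transaction(record_id=None):
    """Return (record_id, xml) for one CSW Insert with a random or given ID."""
    record_id = record_id or str(uuid.uuid4())
    return record_id, CSW_INSERT_TEMPLATE.substitute(record_id=record_id)
```

Each call without an argument yields a fresh random ID, so 2000 calls give 2000 otherwise identical documents to POST to the CSW endpoint.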

I also reset GeoNetwork between runs by running it in Tomcat using the following command to start the server:

rm -rf webapps/geonetwork/ && bin/catalina.sh run

"catalina.sh run" puts Tomcat in the foreground so I can just use ctrl+c to stop it, and "rm -rf webapps/geonetwork/" ensures that the Tomcat autodeployer will set up a fresh GeoNetwork at startup.

Round 1
2000 records, in series.  Basically:

login;
for i in `seq 1 2000`;
do
   generate_and_upload_metadata
done


I didn't run into any problems with this (there were 2000 records in the database after it finished, Tomcat didn't crash, etc.) so I went on to 

Round 2,
adding a bit of parallelism.  (Each subprocess shared the same session cookie, however.)

login;
for i in `seq 1 10`;
do
   ( for j in `seq 1 200`; do generate_and_upload_metadata; done ) &
done

Since I have a multicore processor, I can get a bit of a clue how GeoNetwork is doing for concurrency from just watching my CPU monitor, so I didn't do anything more scientific.  It seemed to do a pretty good job of utilizing all the resources available (no peaking at 50% CPU usage etc).

Round 3
One of the theories floating around this mailing list about GeoNetwork's stability issues was that they are caused by too many sessions (typically there are over 1000 active sessions when I look at the server stats for demo.geonode.org). So this time around I still ran several parallel subshells, but this time I created a new session cookie for each insert (2000 inserts => 2000 active sessions).

for i in `seq 1 10`;
do
   ( for j in `seq 1 200`; do login; generate_and_upload_metadata; done ) &
done

Again GeoNetwork started out by maxing out all my cores, but as the number of sessions increased the CPU usage tended toward 25% (I have a quad-core system, so this means GN was effectively only able to use one thread of execution at this point).  This time, Tomcat stopped responding before the test could complete.  Before it stopped responding entirely there were also some failed inserts (but I still got error responses back).  Score one for the "too many sessions" theory.
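The only difference between the Round 2 and Round 3 drivers is where the login happens. A rough Python sketch of that difference, assuming threads in place of subshells and with `login` and `upload` passed in as placeholders for the real requests (`run_load` is a hypothetical name, not part of the bash script):

```python
from concurrent.futures import ThreadPoolExecutor


def run_load(login, upload, workers=10, per_worker=200, session_per_insert=False):
    """Run workers * per_worker uploads and return how many logins were made.

    session_per_insert=False mirrors Round 2 (one shared session);
    session_per_insert=True mirrors Round 3 (a new session for every insert).
    """
    shared_session = None if session_per_insert else login()

    def worker(_index):
        logins = 0
        for _ in range(per_worker):
            if session_per_insert:
                session = login()
                logins += 1
            else:
                session = shared_session
            upload(session)
        return logins

    with ThreadPoolExecutor(max_workers=workers) as pool:
        worker_logins = sum(pool.map(worker, range(workers)))

    return worker_logins + (0 if session_per_insert else 1)
```

With the defaults this performs 2000 uploads on a single login; with `session_per_insert=True` the same 2000 uploads cost 2000 logins, which is the case that brought Tomcat down.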

Round 4
This time I ran the same script from round 3, but watched Tomcat's manager application and manually cleared sessions periodically (every 500 sessions or so.)  No problems this time around, overall CPU usage hovered around 75% and there were no failed transactions, etc.

Round 5
For completeness I also tried serializing the "2000 requests, 2000 sessions" version of the script.

for i in `seq 1 2000`;
do
    login;
    generate_and_upload_metadata
done

This didn't fare much better than the parallel version from Round 3, but due to the utter lack of parallelism I wrote this entire email while it was running :)

Followup
So I poked around a bit to see what we can do to minimize the number of concurrent sessions.  Surprisingly, logging in doesn't seem to cause a session to be created, and logging out doesn't seem to cause one to be terminated.  As far as I can tell, it is the database access (inserting/retrieving metadata records) that actually prompts it, and it seems that once a session is created it won't be destroyed unless it is left idle long enough to time out.  Performing a search through the web UI does it, and so does a GetRecords request (through the CSW service endpoint).

Therefore, I think we should work on making sure that all our access to GeoNetwork respects any Set-Cookie headers passed to us from GeoNetwork; if we don't pass a session cookie back to the server it will create a new session for each and every CSW request we make.  I think that we should block the GeoNode 1.0.1 RC (as discussed we won't release without running the candidate on demo.geonode.org for a week beforehand) on implementing these cookies in our use of OWSLib (but I expect to be able to take care of that tomorrow and avoid actually delaying the release further.)
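To make the effect concrete, here is a small self-contained sketch of what honoring Set-Cookie buys us. It does not talk to GeoNetwork: `SessionCountingHandler` is a toy local server, invented for this example, that starts a new "session" whenever a request arrives without a JSESSIONID cookie, which is an assumption about the server behavior based on the observations above. The client side uses Python 3's `urllib.request` and `http.cookiejar` (the successors of urllib2 and cookielib).

```python
import http.cookiejar
import http.server
import threading
import urllib.request
import uuid


class SessionCountingHandler(http.server.BaseHTTPRequestHandler):
    """Toy stand-in for the server: any cookieless request starts a session."""
    sessions = set()

    def do_GET(self):
        cookie = self.headers.get("Cookie", "")
        self.send_response(200)
        if "JSESSIONID=" not in cookie:
            session_id = uuid.uuid4().hex
            type(self).sessions.add(session_id)
            self.send_header("Set-Cookie", "JSESSIONID=%s; Path=/" % session_id)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def count_sessions(opener, url, requests=5):
    """Make several requests with one opener; return sessions the server created."""
    SessionCountingHandler.sessions.clear()
    for _ in range(requests):
        opener.open(url).close()
    return len(SessionCountingHandler.sessions)


server = http.server.HTTPServer(("127.0.0.1", 0), SessionCountingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_port

no_proxy = urllib.request.ProxyHandler({})  # ignore any proxy settings locally
plain = urllib.request.build_opener(no_proxy)
jar = http.cookiejar.CookieJar()
cookie_aware = urllib.request.build_opener(
    no_proxy, urllib.request.HTTPCookieProcessor(jar))

sessions_without_cookies = count_sessions(plain, url)
sessions_with_cookies = count_sessions(cookie_aware, url)

server.shutdown()
server.server_close()
```

Against this toy server, five requests through the plain opener create five sessions, while five requests through the cookie-aware opener create one.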

--
David Winslow

Sebastian Benthall

Apr 15, 2011, 12:27:18 PM
to David Winslow, geono...@opengeo.org
Bravo, David!
 
> Therefore, I think we should work on making sure that all our access to GeoNetwork respects any Set-Cookie headers passed to us from GeoNetwork; if we don't pass a session cookie back to the server it will create a new session for each and every CSW request we make.  I think that we should block the GeoNode 1.0.1 RC (as discussed we won't release without running the candidate on demo.geonode.org for a week beforehand) on implementing these cookies in our use of OWSLib (but I expect to be able to take care of that tomorrow and avoid actually delaying the release further.)

+1

--
Sebastian Benthall
OpenGeo - http://opengeo.org

David Winslow

Apr 15, 2011, 1:33:00 PM
to Sebastian Benthall, geono...@opengeo.org
Ok, looks like OWSLib is not really factored to allow what I'd like (there should be one urllib2.OpenerDirector for each CatalogueServiceWeb instance, configurable by client code).  Rather than try to get master updated to work with OWSLib trunk and get a patch to OWSLib accepted today, I think we should do a monkey patch for this release, with a real fix developed in coordination with the OWSLib developers soon to follow (should be in plenty of time for 1.1, especially since the synth branch is already compatible with OWSLib trunk).

Here's the diff for the monkey patch I'm proposing.  It's a little brittle (it depends on geonode.geonetwork being loaded for owslib.csw to work the way we want) but temporary.  From my local tests, the current master branch generates 57 GeoNetwork sessions upon running updatelayers with an empty database, 15 more for repeated runs of "updatelayers", and 2 for each time the search page is loaded.

With this patch, it is a flat one session per Python process.  (updatelayers produces two because both the django-admin.py command and the running server connect to GeoNetwork.  Repeated calls only produce one additional session, since django-admin.py doesn't remember the cookie between runs but the running server persists.)  The search page doesn't produce additional sessions either.
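The general shape of such a monkey patch can be sketched as follows. This is not the proposed diff and does not touch OWSLib: `fake_csw` is a hypothetical stand-in for a library module that fetches URLs through a module-level `urlopen`, and `install_cookie_opener` is a name invented for this sketch. It is written against Python 3's `urllib.request` rather than urllib2.

```python
import http.cookiejar
import types
import urllib.request

# Hypothetical stand-in for a library module that calls a module-level
# urlopen(); this is NOT OWSLib's real code or layout.
fake_csw = types.ModuleType("fake_csw")
fake_csw.urlopen = urllib.request.urlopen


def install_cookie_opener(module):
    """Point module.urlopen at one shared, cookie-aware opener.

    Every later call made through module.urlopen reuses the same cookie jar,
    so a session cookie set by the server is sent back on the next request.
    """
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    module.urlopen = opener.open
    return jar


# Runs once at import time of whichever module holds the patch, which gives
# one jar (and so at most one server session) per Python process.
jar = install_cookie_opener(fake_csw)
```

The brittleness is visible here too: the patch only takes effect if the module containing `install_cookie_opener(...)` is imported before the library makes its first request.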

--
David Winslow

David Winslow

Apr 15, 2011, 1:37:39 PM
to Sebastian Benthall, geono...@opengeo.org
HERE's the diff for the monkey patch I'm proposing: https://github.com/dwins/geonode/compare/master...csw-cookies

-d