Concurrent access

brucef

May 9, 2007, 7:53:13 PM
to pydap, ra...@ssec.wisc.edu, bru...@ssec.wisc.edu
I'm having some trouble accessing my PyDAP server from multiple
processes at the same time. Are there any known issues regarding
concurrent server access?

The errors show up when several processes hit the server
simultaneously. I thought it might be an issue with the way Paste was
serving the datasets, so I set up Apache FastCGI, but to no avail.

I've included my test script along with the errors I get for each
server type. Unfortunately, my PyDAP server is behind a firewall so
I'm hoping someone has a public server+netcdf dataset to use for
testing.

Bruce Flynn
UW SSEC
Rm. 239
(608) 262.6172

Test script
=================================================================
#!/usr/bin/env sh
#
# USAGE: dap_test.sh <num processes>
#

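URL='http://tine6.ssec.wisc.edu:9090/aeri/ARM_CDF/070303_SUM.nc.ascii?timeHHMMSS[0:1:499]'

# Launch $1 background downloads at the same time.
for i in $(seq 1 "$1"); do
    (if ! wget --quiet "${URL}"; then echo failed; fi) &
done
wait  # block until every background wget has exited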
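
For readers who would rather drive the same test from Python, a rough equivalent might look like the sketch below. It is written for Python 3 (the thread itself used Python 2.5) and uses threads rather than separate wget processes, so it is not an exact substitute. The names `fetch_status` and `run_concurrently` are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def fetch_status(url, timeout=30):
    """Return the HTTP status code for url, or a short error string."""
    try:
        with urlopen(url, timeout=timeout) as response:
            response.read()  # drain the body so the request fully completes
            return response.status
    except HTTPError as exc:
        # The server answered, but with an error status such as 500.
        return exc.code
    except (URLError, OSError) as exc:
        # No usable answer at all (connection refused, timeout, ...).
        return "error: %s" % exc


def run_concurrently(url, count, fetch=fetch_status):
    """Issue `count` requests for `url` at once; return one result per request."""
    with ThreadPoolExecutor(max_workers=count) as pool:
        return list(pool.map(fetch, [url] * count))
```

Calling `run_concurrently(url, 10)` with the URL from the shell script would return a list of ten results; any 500 entries would correspond to the "failed" lines printed by the script.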

Client side error
=============================================================
500: Internal Server Error.

Paste server side error
============================================================
Traceback (most recent call last):
File "/home/brucef/lib/python-ez/Paste-1.3-py2.5.egg/paste/httpserver.py", line 592, in process_request_in_thread
Traceback (most recent call last):
File "/home/brucef/lib/python-ez/Paste-1.3-py2.5.egg/paste/httpserver.py", line 592, in process_request_in_thread
File "/home/brucef/opt/Python-2.5/lib/python2.5/SocketServer.py", line 254, in finish_request
self.RequestHandlerClass(request, client_address, self)
File "/home/brucef/opt/Python-2.5/lib/python2.5/SocketServer.py", line 521, in __init__
File "/home/brucef/opt/Python-2.5/lib/python2.5/SocketServer.py", line 254, in finish_request
self.handle()
File "/home/brucef/lib/python-ez/Paste-1.3-py2.5.egg/paste/httpserver.py", line 400, in handle
self.RequestHandlerClass(request, client_address, self)
File "/home/brucef/opt/Python-2.5/lib/python2.5/SocketServer.py", line 521, in __init__
self.handle()
File "/home/brucef/lib/python-ez/Paste-1.3-py2.5.egg/paste/httpserver.py", line 400, in handle
File "/home/brucef/opt/Python-2.5/lib/python2.5/BaseHTTPServer.py", line 316, in handle
self.handle_one_request()
File "/home/brucef/lib/python-ez/Paste-1.3-py2.5.egg/paste/httpserver.py", line 395, in handle_one_request
File "/home/brucef/lib/python-ez/Paste-1.3-py2.5.egg/paste/httpserver.py", line 268, in wsgi_execute
File "/home/brucef/lib/python-ez/Paste-1.3-py2.5.egg/paste/wsgilib.py", line 257, in next
File "/home/brucef/opt/Python-2.5/lib/python2.5/BaseHTTPServer.py", line 316, in handle
self.handle_one_request()
File "/home/brucef/lib/python-ez/Paste-1.3-py2.5.egg/paste/httpserver.py", line 395, in handle_one_request
File "/home/brucef/opt/Python-2.5/lib/python2.5/site-packages/dap-2.2.5.9-py2.5.egg/dap/wsgi/application.py", line 78, in __iter__
self.start(status, response_headers)
File "/home/brucef/lib/python-ez/Paste-1.3-py2.5.egg/paste/httpserver.py", line 156, in wsgi_start_response
AssertionError: Attempt to set headers a second time w/o an exc_info
File "/home/brucef/lib/python-ez/Paste-1.3-py2.5.egg/paste/httpserver.py", line 269, in wsgi_execute
File "/home/brucef/lib/python-ez/Paste-1.3-py2.5.egg/paste/httpserver.py", line 118, in wsgi_write_chunk
TypeError: unpack non-sequence

Apache FastCGI server side error
=========================================================
Traceback (most recent call last):
File "build/bdist.linux-i686/egg/flup/server/fcgi_base.py", line 558, in run
protocolStatus, appStatus = self.server.handler(self)
File "build/bdist.linux-i686/egg/flup/server/fcgi_base.py", line 1116, in handler
write(data)
File "build/bdist.linux-i686/egg/flup/server/fcgi_base.py", line 1059, in write
assert headers_set, 'write() before start_response()'
AssertionError: write() before start_response()
Traceback (most recent call last):
File "build/bdist.linux-i686/egg/flup/server/fcgi_base.py", line 558, in run
protocolStatus, appStatus = self.server.handler(self)
File "build/bdist.linux-i686/egg/flup/server/fcgi_base.py", line 1114, in handler
for data in result:
File "/home/brucef/opt/Python-2.5/lib/python2.5/site-packages/Paste-1.3-py2.5.egg/paste/wsgilib.py", line 257, in next
return self.app_iter.next()
File "/home/brucef/opt/Python-2.5/lib/python2.5/site-packages/dap-2.2.5.9-py2.5.egg/dap/wsgi/application.py", line 78, in __iter__
self.start(status, response_headers)
File "build/bdist.linux-i686/egg/flup/server/fcgi_base.py", line 1093, in start_response
assert not headers_set, 'Headers already set!'
AssertionError: Headers already set!

Rob Cermak

May 9, 2007, 8:09:26 PM
to py...@googlegroups.com
Hi,

We have noticed this too with the paste server, etc. The only way I can
get concurrent access to work is using the Cherokee web server option and
hooking that up with the pydap server.

However, there seems to be a memory leak in there somewhere as the
process handlers grow over time.

Rob


--
Alaska Ocean Observing System
Data Manager
907-474-7948

Bruce Flynn

May 10, 2007, 11:24:56 AM
to py...@googlegroups.com
Hi Rob,

I seem to have found a remedy, one that works for my tests at least.
I'll have to do a lot more digging around before I can determine if
this is the "right" way to do this or not. Or I'll just throw it at
the Paste folks and see what they can do with it. Maybe you might
have some insight into this. I've included a patch below for what I
did to httpserver.py in my Paste 1.3 distribution. The changes
simply include locking threaded access to the application execution.
I'm sure there's a better way to do finer-grained locking to
improve liveness, but a slow application is better than a broken
one at this point. :)

Also, would you be willing to share your relevant Cherokee config? I
would like to look into going that route but I haven't dealt with
Cherokee before.

Bruce


--- httpserver.py 2007-05-10 10:07:40.000000000 -0500
+++ httpserver.py.bmf 2007-05-10 10:08:00.000000000 -0500
@@ -31,6 +31,10 @@
__all__ = ['WSGIHandlerMixin', 'WSGIServer', 'WSGIHandler', 'serve']
__version__ = "0.5"

+#FIXME BMF
+import threading
+_lock = threading.RLock()
+
class ContinueHook(object):
"""
When a client request includes a 'Expect: 100-continue' header, then
@@ -258,14 +262,16 @@
"""
Invoke the server's ``wsgi_application``.
"""
-
+ import threading
self.wsgi_setup(environ)
-
try:
+ _lock.acquire()
result = self.server.wsgi_application(self.wsgi_environ,
self.wsgi_start_response)
try:
for chunk in result:
+ #T = threading.currentThread()
+ #print T.getName(), chunk
self.wsgi_write_chunk(chunk)
if not self.wsgi_headers_sent:
self.wsgi_write_chunk('')
@@ -281,6 +287,8 @@
[('Content-type', 'text/plain')])
self.wsgi_write_chunk("Internal Server Error\n")
raise
+ finally:
+ _lock.release()

#
# SSL Functionality
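
The idea in the patch above (let only one request at a time run the application) can also be tried without modifying Paste, by wrapping the application in a small piece of WSGI middleware. The following is only a sketch under stated assumptions: `SerializedApp` is a name invented here, the code targets Python 3 rather than the Python 2.5 used in the thread, and it uses a plain `threading.Lock` where the patch used an `RLock`. Like the patch, it trades throughput for correctness.

```python
import threading


class SerializedApp:
    """WSGI middleware (hypothetical) that serializes calls to the wrapped app."""

    def __init__(self, app):
        self.app = app
        self._lock = threading.Lock()

    def __call__(self, environ, start_response):
        # Return a generator so the lock is held for the whole response,
        # not just for the initial call into the application.
        return self._locked_iter(environ, start_response)

    def _locked_iter(self, environ, start_response):
        with self._lock:
            result = self.app(environ, start_response)
            try:
                for chunk in result:
                    yield chunk
            finally:
                # WSGI servers call close() on the iterable they receive;
                # pass that on to the wrapped application's result.
                close = getattr(result, "close", None)
                if close is not None:
                    close()
```

The lock is released once the server has consumed or closed the response, so a slow client holds up every other request. That is acceptable for a test, but probably not for a busy server.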

Rob De Almeida

May 10, 2007, 1:04:25 PM
to py...@googlegroups.com
Hi, Bruce.

The server that comes with Paste (``Paste#http``) is based on the
BaseHTTPServer module from the Python standard library, and not intended
for production use -- I'm not even sure if it was designed for
concurrent access, so that could explain the problems you were having.

Paste Script comes with at least two servers that are more robust. You
can benchmark your pydap server trying the ``PasteScript#cherrypy`` or
``PasteScript#wsgiutils`` servers. Paste Script should've been installed
when you installed pydap, so you can just edit your configuration file
and replace the server string.
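
As an illustration of "replacing the server string", the `[server:main]` section of the Paste Deploy configuration file might be changed along these lines. The host and port values are placeholders chosen for this example, not taken from anyone's actual setup:

```ini
[server:main]
# Previously: use = egg:Paste#http
use = egg:PasteScript#cherrypy
# Placeholder values; keep whatever host and port the file already uses.
host = 127.0.0.1
port = 8080
```

Swapping in ``egg:PasteScript#wsgiutils`` works the same way; the rest of the file (the application sections) stays unchanged.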

Even so, I ran into the same problem some time ago when benchmarking
pydap with these two servers: I used the "ab" tool that comes with
Apache, and the servers complained with the "Headers already set!"
message under too many concurrent requests to pydap. I'll
investigate this a little more today and see if I can find a solution.

The pydap test server (http://test.pydap.org) runs behind Cherokee
through a SCGI adapter. I'm sending you my configuration (split in 3
different files), but it's basically the configuration described here:

http://pydap.org/docs/server.html

I think it would be interesting to take a look at Rob Cermak's setup
also, since as I said the test server has a very low number of
concurrent accesses (if any).

Thanks,
--Rob (De Almeida)

Attachments: test.pydap.org, server.ini, run.sh

Bruce Flynn

May 10, 2007, 2:33:49 PM
to py...@googlegroups.com
Rob,

The Paste#http server was my first thought as a possible issue so I
switched over to using FastCGI with Apache. Since I saw the same
behavior I did the testing with the simpler Paste#http server.

I did try out the PasteScript#wsgiutils server but ran into a similar
error:

ERROR:root:Traceback (most recent call last):
File "build/bdist.linux-i686/egg/wsgiutils/wsgiServer.py", line 131, in runWSGIApp
self.wsgiWriteData (data)
File "build/bdist.linux-i686/egg/wsgiutils/wsgiServer.py", line 167, in wsgiWriteData
status, headers = self.wsgiHeaders
ValueError: need more than 0 values to unpack

I'm assuming that the same sort of thread locking around the request
writing will "fix" it, but right now I just need something that works
so I'm going to give Cherokee a try and see what I get.

Thanks,
Bruce

> ##
> ## Virtual server for test.pydap.org
> ##
> Server test.pydap.org, www.test.pydap.org {
> DirectoryIndex index.php, index.html, index.htm, index.shtml, cherokee.index.html
> DocumentRoot /var/www/test.pydap.org
>
> Log combined {
> AccessLog /var/log/cherokee/test.pydap.org.access
> ErrorLog /var/log/cherokee/test.pydap.org.error
> }
>
> Directory / {
> Handler scgi {
> Server localhost:8006 {
> Interpreter "/var/www/test.pydap.org/run.sh"
> }
> }
> }
> }
> [server:main]
> #use = egg:PasteScript#flup_scgi_thread
> use = egg:PasteScript#flup_scgi_fork
> host = 127.0.0.1
> port = 8006
>
> [filter-app:main]
> use = egg:Paste#httpexceptions
> next = cascade
>
> [composit:cascade]
> use = egg:Paste#cascade
> app1 = static
> app2 = thredds
> app3 = pydap
> catch = 404
>
> [app:static]
> use = egg:Paste#static
> document_root = %(here)s/data
>
> [app:thredds]
> use = egg:thredds
> root = %(here)s/data
> type = OpenDAP
> template = %(here)s/template/catalog.xml
>
> [app:pydap]
> use = egg:dap
> name = Test-Server
> root = %(here)s/data
> verbose = 0
> template = %(here)s/template
> #!/bin/bash
> export PYTHONPATH=/home/roberto/Python:/home/roberto/Python/lib/python2.4/site-packages
> /home/roberto/Python/paster serve /var/www/test.pydap.org/server.ini

Rob De Almeida

May 10, 2007, 4:34:14 PM
to py...@googlegroups.com
Bruce Flynn wrote:
> The Paste#http server was my first thought as a possible issue so I
> switched over to using FastCGI with Apache. Since I saw the same
> behavior I did the testing with the simpler Paste#http server.
>
> I did try out the PasteScript#wsgiutils server but ran into a similar
> error:
>
> ERROR:root:Traceback (most recent call last):
> File "build/bdist.linux-i686/egg/wsgiutils/wsgiServer.py", line 131, in runWSGIApp
> self.wsgiWriteData (data)
> File "build/bdist.linux-i686/egg/wsgiutils/wsgiServer.py", line 167, in wsgiWriteData
> status, headers = self.wsgiHeaders
> ValueError: need more than 0 values to unpack
>
> I'm assuming that the same sort of thread locking around the request
> writing will "fix" it, but right now I just need something that works
> so I'm going to give Cherokee a try and see what I get.

I did some more tests with "ab", and I noticed that these errors occur
with netcdf files, but not with CSV files: I can do hundreds of
concurrent accesses to the sample.csv file without problems.

In theory the logic behind the two plugins is exactly the same, so I'm
having trouble figuring out where the problem is.

--Rob
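
One way errors such as "Headers already set!" and "write() before start_response()" can arise is when a single application object, shared by all requests, stores per-request state on itself. The tracebacks earlier in the thread show `self.start(status, response_headers)` being called from `__iter__`, which is at least consistent with that pattern. Whether pydap's application object was really shared like this is an assumption, not something the thread confirms. The classes below (`SharedStateApp`, `PerRequestApp`) are hypothetical and are not pydap code. The two requests are interleaved by hand instead of with threads so that the outcome is deterministic.

```python
class SharedStateApp:
    """Hypothetical app that (unsafely) keeps per-request state on itself."""

    def __call__(self, environ, start_response):
        self.start = start_response  # overwritten by every new request
        return self

    def __iter__(self):
        self.start("200 OK", [("Content-Type", "text/plain")])
        yield b"data"


class PerRequestApp:
    """Hypothetical fix: each request gets its own generator and closure."""

    def __call__(self, environ, start_response):
        def body():
            start_response("200 OK", [("Content-Type", "text/plain")])
            yield b"data"
        return body()


def make_start_response(log, label):
    """Build a start_response callable that records which request it served."""
    def start_response(status, headers, exc_info=None):
        log.append(label)
    return start_response


def simulate(app):
    """Start requests A and B before either response body is consumed."""
    log = []
    first = app({}, make_start_response(log, "A"))
    second = app({}, make_start_response(log, "B"))
    list(first)   # the server for request A iterates its response
    list(second)  # the server for request B iterates its response
    return log
```

With the shared-state version, `simulate` records B's `start_response` being called twice and A's never being called: request B would see headers set a second time, and request A would receive body data before any headers. With the per-request version, each request's `start_response` is called exactly once.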

Rob Cermak

May 11, 2007, 12:27:14 PM
to py...@googlegroups.com
This impacts the grib2 plugin as well.

I think it may just be a matter of speed. The CSV plugin is definitely
much faster than any of the others. Try a CSV file of 1 MB or larger,
so that the plugin is kept busy for a while.

Rob

Rob De Almeida

Sep 14, 2007, 1:06:42 PM
to py...@googlegroups.com, ra...@ssec.wisc.edu, bru...@ssec.wisc.edu, cer...@sfos.uaf.edu
Hi, guys.

Just in case you haven't received/read the Changelog for pydap 2.2.6,
the concurrency problem is now fixed.

Thanks,
--Rob


--
Dr. Roberto De Almeida
http://dealmeida.net/
http://lattes.cnpq.br/1858859813771449
