Before I delve deeper into this on my own, I'm curious whether anyone has
experience with letting one process run gevent (single-threaded) and
farming out work to subprocesses started with the multiprocessing
module? My application is not a traditional web server, and part of this
legacy protocol is quite CPU-intensive (there is some cryptography
involved), so I'm trying to make it a little bit more scalable.
I've done some experimentation, but when I try to send messages over a
shared multiprocessing.Queue it blocks. I have tried both with
monkey.patch_all() and without any monkey patching.
Is this supposed to work or is there more to it?
Since all my socket operations are non-blocking, I can of course start
the subprocesses on my own, open a socket, and implement some kind of
protocol to make inter-process RPC possible, but the multiprocessing
module is already there and it sorts out all the quirks on the different
platforms I need to support.
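For what it's worth, the socket-based fallback described above can be sketched as a small length-prefixed RPC protocol over a socketpair. This is only a sketch assuming POSIX fork; the helper names are illustrative, and in the real application the parent's end of the socket would be used through gevent's cooperative socket so reads don't block the event loop.

```python
import json
import os
import socket
import struct

def send_msg(sock, obj):
    # Length-prefixed JSON: 4-byte big-endian length, then the payload.
    payload = json.dumps(obj).encode()
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def _recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed")
        buf += chunk
    return buf

def recv_msg(sock):
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode())

def demo():
    parent, child = socket.socketpair()
    if os.fork() == 0:
        # Worker process: do the CPU-bound job, reply, and exit.
        parent.close()
        req = recv_msg(child)
        send_msg(child, {"result": req["x"] * 2})
        os._exit(0)
    child.close()
    send_msg(parent, {"x": 21})
    reply = recv_msg(parent)
    os.wait()
    return reply["result"]
```

The framing avoids the semaphore-backed machinery inside multiprocessing.Queue, which is where the blocking under gevent seems to come from.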
Regards,
Jan Persson
I've done some more research on my own and I can definitely say that
making multiprocessing cooperative is a major undertaking. It is most
likely doable, but it would certainly help to have intimate knowledge
of both multiprocessing and gevent, and I'm sorry to say that I'm
not up to this task.
As Alex Dong points out, for server-side applications a message-queueing
scheme is probably the way to go right now. But that will not
help me, since the product I'm building is an off-the-shelf
application and we cannot force the customers to install an MQ on
every single machine just to avoid the GIL.
Regards
//Jan Persson
--
Jan Persson - Esentus Technology AB - www.esentus.com - +46 702 854132 (mobile)
The consumers can then be standalone processes spawned
independently from the producer process...
m
On Thu, Jan 06, 2011 at 02:39:54PM +0100, Jan Persson wrote:
> Thanks for all the answers.
>
> On Mon, Dec 27, 2010 at 20:54, Equand <equ...@gmail.com> wrote:
> > there is a non-blocking stdin/stdout solution; you just have to google
> > 'gevent non-blocking stdin'...
> > you might also implement the same with multiprocessing
> >
> > On Dec 27, 3:49 am, Alex Dong <alex.d...@gmail.com> wrote:
> >> Jan, we have similar challenges here at trunk.ly. Once we've got a link,
> >> we need to do some quite computationally intense things, including
> >> constructing a search index.
> >> Here is how we're doing it:
> >>
> >> We use redis to connect the pipelines together. The gevent components
> >> crawl the web and put the html into the queue. Then other
> >> multiprocessing units pick up the tasks from the queue, using redis'
> >> BLPOP/SUBSCRIBE, and do their own job. We have EC2 scripts to automatically
> >> start an instance and let it join the processing army with little impact on
> >> the gevent-based crawlers.
> >>
> >> We've scaled this structure out to more than 10 EC2 instances as
> >> processors. It's been working great for us. It also limits the scope in
> >> which gevent gets applied.
> >>
> >> HTH,
> >> Alex
--
Matt Billenstein
ma...@vazor.com
http://www.vazor.com/
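The redis-backed pipeline Alex describes could be sketched roughly as below. The key and function names are illustrative, and the actual redis calls are shown only as comments since they assume a running server and the redis-py client; only the serialization helpers are concrete.

```python
import json

# Serialization helpers for the queue payload (names are illustrative).
def encode_task(url, html):
    return json.dumps({"url": url, "html": html})

def decode_task(raw):
    return json.loads(raw)

# Producer side, inside the gevent-based crawler:
#     r = redis.Redis()
#     r.lpush("crawl:results", encode_task(url, html))
#
# Consumer side, a standalone worker process (possibly on another
# EC2 instance); BLPOP blocks until a task is available, so workers
# can be added or removed without touching the crawler:
#     while True:
#         _key, raw = r.blpop("crawl:results")
#         task = decode_task(raw)
#         build_search_index(task)    # the CPU-intensive part
```

Because the broker sits between the two sides, the gevent process never shares in-process synchronization primitives with the workers, which sidesteps the blocking problem entirely.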
1. Give Queue.get() a timeout and, on timeout, send an empty keep-alive
message. If the client has disconnected, the write will fail and no
further element will be requested from the stream iterator.
2. Give Queue.get() a timeout and, on timeout, check the socket with
select. Return from the iterator if the connection was closed. Adapted from
http://bytes.com/topic/python/answers/40278-detecting-shutdown-remote-socket-endpoint
Do you consider these a good solution?
1:

from Queue import Empty

@expose()
def eventstream(self):
    response.headers['Content-type'] = 'text/event-stream'
    response.charset = ""
    q = get_queue_to_stream()
    def stream():
        while True:
            try:
                # Wait up to 1s for a new message
                msg = q.get(True, 1)
                yield "data: %s\n\n" % msg
            except Empty:
                # No new message for 1s; send a keep-alive. If the
                # client is gone, this write fails and the iterator
                # is not advanced again.
                yield "data: ping\n\n"
    return stream()
2:

import select
from Queue import Empty

@expose()
def eventstream(self):
    response.headers['Content-type'] = 'text/event-stream'
    response.charset = ""
    # Get the socket somehow, e.g. from the request environment
    sock = request.environ['gunicorn.sock']
    q = get_queue_to_stream()
    def stream():
        while True:
            try:
                # Wait up to 1s for a new message
                msg = q.get(True, 1)
                yield "data: %s\n\n" % msg
            except Empty:
                # Zero-timeout poll: a readable or errored socket here
                # means the peer closed the connection, so stop streaming.
                r, w, e = select.select([sock], [], [sock], 0)
                if r or e:
                    return
    return stream()
Correct me if I'm wrong, but aren't your examples relying on
multiprocessing using fork to create processes, so that the file
descriptors are inherited?
Regards
//Jan Persson
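The point Jan raises can be demonstrated in a few lines: on POSIX, multiprocessing's fork start method lets a child inherit the parent's open file descriptors, which is what socket-passing examples like these depend on. A minimal sketch (explicitly requesting the fork context, since spawn-based platforms such as Windows would need explicit fd passing instead):

```python
import socket
from multiprocessing import get_context

def child(conn):
    # The child can use this socket only because fork() made it
    # inherit the parent's file descriptor.
    conn.send(b"hello from child")
    conn.close()

def demo():
    ctx = get_context("fork")           # not available on Windows
    parent_sock, child_sock = socket.socketpair()
    p = ctx.Process(target=child, args=(child_sock,))
    p.start()
    child_sock.close()                  # drop the parent's copy of the child end
    data = parent_sock.recv(64)
    p.join()
    return data

if __name__ == "__main__":
    print(demo())
```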
On Thu, Jan 13, 2011 at 00:18, Travis Cline <travis...@gmail.com> wrote:
>
> This thread got my wheels turning a bit and I tossed together a few
> small examples:
>
> multiproc chat server https://gist.github.com/777085
> multiproc echo server https://gist.github.com/776364
>
> It appears that the multiprocessing Event and Queue implementations
> block on a semaphore in C code, which is a bit of a dead end. The
> Pipe implementation's use of the socket API also appears
> incompatible with gevent's -- again C code.
>
> I fell back to just using another StreamServer and communicating over
> a socket from the children to the parent process.
>
> Might be useful.
>
>
> Travis
--
Yes, standard pre-forking; look up 'accept before fork'.
Travis
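For readers unfamiliar with the term: "accept before fork" means the listening socket is created and bound *before* forking, so every worker inherits the same listening fd and can call accept() on it, with the kernel handing each incoming connection to exactly one process. A minimal single-child sketch of the idea (POSIX only; a real pre-forking server would fork several workers that loop on accept):

```python
import os
import socket

def demo():
    # Bind and listen *before* forking so the child inherits the fd.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", 0))          # port 0: let the kernel pick one
    srv.listen(8)
    port = srv.getsockname()[1]

    pid = os.fork()
    if pid == 0:
        # Child: accept on the inherited listening socket.
        conn, _addr = srv.accept()
        conn.sendall(b"served by child\n")
        conn.close()
        os._exit(0)

    # Parent: act as a client; the child handles the connection.
    cli = socket.create_connection(("127.0.0.1", port))
    data = cli.recv(64)
    cli.close()
    os.waitpid(pid, 0)
    srv.close()
    return data
```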