How to make something asynchronous to be synchronized?


一首诗

Nov 14, 2006, 7:43:04 PM
to TurboGears
Hi all,

In my situation, I need to fetch some data from a remote server to
render a web page, just like fetching data from a DB.

It's an asynchronous process because the TurboGears process has to
wait until the response comes back, and it might take hundreds of
milliseconds, several seconds, or it might never come back.

The straightforward way to do it would be to have a tight loop in the
code, but I'm not sure whether that would block the entire web site.

So, my question is:

How do I make an asynchronous process synchronous without any
tight loops?

Or how did they make a DB operation work just like a
function call, without having to consider any asynchronous problems?

BTW,

where could I keep the socket that is used to communicate with
the remote server? It would be awful to create it each time I need it.

Jorge Godoy

Nov 14, 2006, 8:23:00 PM
to turbo...@googlegroups.com
"一首诗" <newp...@gmail.com> writes:

> how to make an asynchronous process synchronized without having any
> tight loops?

Just call for the result and wait for it to be returned. If you have too
many calls, then you'll have to support that many connections to the database
and the amount of data being handled.

> Or how did they made it possible to make db operation just like a
> function call without considering any asynchronoused problem?

It is easier to deal with synchronous stuff than asynchronous stuff...

> Where could I save the socket which would be used to communicate with
> the remote server? It will be awful to create it each time I need it?

Explain what you want to do. If it is a database server, you'll probably
end up with a connection pool, which makes it easy to reuse idle
connections.
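
A minimal sketch of the connection-pool idea, in case it helps. ConnectionPool and make_conn are made-up names for illustration, not part of TurboGears or CherryPy; make_conn stands for whatever creates your socket or DB connection.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Hypothetical fixed-size pool; `make_conn` is any zero-argument factory."""

    def __init__(self, make_conn, size=4):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(make_conn())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is idle; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the caller raised.
            self._idle.put(conn)
```

A request handler would then write "with pool.connection() as conn:" and use conn inside the block, so connections are created once and reused.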

--
Jorge Godoy <jgo...@gmail.com>

一首诗

Dec 12, 2006, 9:58:53 AM
to TurboGears
Hi,

This is what I really want to do: talk over UDP to a data source other
than a DB.

I have to make my TurboGears program talk to a remote server over UDP,
which means that while handling an HTTP request in controller.py, I have to
send out a UDP packet and block the thread until the response is
received.
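
One way this could look, as a rough sketch only: a blocking UDP request with a timeout, using the standard socket module. The function name udp_request and its parameters are made up for illustration. It assumes one reply datagram per request and does not check who sent the reply.

```python
import socket


def udp_request(address, payload, timeout=2.0, bufsize=4096):
    """Send one datagram and block until a reply arrives or `timeout` expires.

    Returns the reply bytes, or None if nothing came back in time.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.settimeout(timeout)  # recvfrom() sleeps in the OS; no busy loop
        sock.sendto(payload, address)
        reply, _sender = sock.recvfrom(bufsize)
        return reply
    except socket.timeout:
        return None
    finally:
        sock.close()
```

This opens a socket per call to keep the example short. The waiting thread is blocked inside recvfrom() rather than spinning, and the timeout covers the case where the response never comes back.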

I have little idea of how to make this happen. Maybe I could write
another thread to send and receive packets, and wake up the waiting
thread at the right time.

Do you have any suggestions? Thanks a lot!

By the way, I also need to process the binary data in the UDP packets, and
it's a little complicated to use struct. Are there any more convenient
packages that you could recommend?
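
For what it's worth, struct gets easier to live with if the format is precompiled and the fields are wrapped in a namedtuple. A small sketch follows; the wire format here (message type, record ID, length-prefixed name) is invented for the example and is not the real protocol.

```python
import struct
from collections import namedtuple

# Hypothetical wire format: network byte order, 2-byte message type,
# 4-byte record ID, 1-byte name length, then the name bytes.
HEADER = struct.Struct("!HIB")
Message = namedtuple("Message", "msg_type record_id name")


def pack_message(msg):
    name = msg.name.encode("utf-8")
    return HEADER.pack(msg.msg_type, msg.record_id, len(name)) + name


def unpack_message(data):
    msg_type, record_id, name_len = HEADER.unpack_from(data)
    start = HEADER.size
    name = data[start:start + name_len].decode("utf-8")
    return Message(msg_type, record_id, name)
```

With this, unpack_message(pack_message(Message(1, 100, "TG"))) gives back the same Message, and the rest of the code can use field names instead of tuple indexes.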

Thanks again!


Arnar Birgisson

Dec 12, 2006, 10:32:17 AM
to turbo...@googlegroups.com
On 12/12/06, 一首诗 <newp...@gmail.com> wrote:
> By the way, I also need to process the binary data in UDP packages, and
> it's a little complicated to use struct. Is there any more convenient
> packages that you may recommend?

I haven't tried this out myself, but this looks promising:
http://www.sis.nl/python/xstruct/xstruct.shtml

Sorry I can't help with the other part.

Arnar

一首诗

Dec 12, 2006, 10:50:01 AM
to TurboGears
Oh, I've tried it.

It's built for Python 1.5!

I don't know why it hasn't been updated.

Rol El

Dec 16, 2006, 8:25:51 PM
to TurboGears
Hello,

I have exactly the same question about how to use asynchronous calls in
TG.

I have a library which talks to a remote server, and a reply might take
anywhere from a few milliseconds to never, I guess. The way I talk to it is
through a callback.

The page request comes into the controller, then I call
GetMeTheData(myCallBack, arg1, arg2), and this function returns
immediately. Then I have to wait for "myCallBack" to get the data
before I return from the controller. Does anyone have ideas on how to do
this in TG?

thanks.

Diez B. Roggisch

Dec 17, 2006, 11:46:53 AM
to turbo...@googlegroups.com
Rol El wrote:

How is that callback invoked? Is there a thread spawned to fetch the data?

Whatever happens, just put the data into the session via myCallback. You
should be able to do it like this:

class Callback(object):

    def __init__(self):
        # get the current user's session, so we have it when invoked
        # from a different thread!
        self.session = cherrypy.session

    def __call__(self, *args, **kwargs):
        self.session['GetMeTheDataResults', (args, kwargs))


Now if you want your user to be notified about the arrived data
immediately, that is the tricky part: either you do frequent polling
triggered by JavaScript, or you take a look at server-push technologies
like Comet, but AFAIK that is not really a
good option for TG right now.

Diez

Diez B. Roggisch

Dec 17, 2006, 12:05:51 PM
to turbo...@googlegroups.com
> def __call__(self, *args, **kwargs):
>
> self.session['GetMeTheDataResults', (args, kwargs))


That of course needs to be

self.session['GetMeTheDataResults'] = (args, kwargs)

Diez

Igor Foox

Dec 17, 2006, 12:08:53 PM
to turbo...@googlegroups.com

Hmm, it's a bit difficult to say without knowing more about what your
application has to do. What does GetMeTheData do? Does it spawn off another
thread that fetches the data?

It seems that what you're trying to do is make an asynchronous action
synchronous.

As far as I can see you have three options:
- Change GetMeTheData so it is synchronous, i.e. it only returns after
  it gets the data instead of returning right away. Then you can just call
  myCallBack yourself, or do whatever it does in-line.

- Add some sort of locking so that GetMeTheData releases some lock
  and the controller method that calls it waits for that lock.

- Return from the controller immediately, and create some AJAX code
  that runs on the client, polls the server at regular intervals (say 5
  seconds), and shows the user the results when they come back.

It really depends on whether you can modify the code in GetMeTheData and
on what you're trying to achieve from the user's point of view
(asynchronous or synchronous UI).
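
To make the locking option a bit more concrete, here is a rough sketch using threading.Event. The names call_sync and fake_get_data are invented; fake_get_data only stands in for a callback-style library call and is not the real GetMeTheData.

```python
import threading


def call_sync(start_async, *args, timeout=5.0):
    """Run a callback-style call and block until its callback fires.

    `start_async(callback, *args)` must return immediately and later invoke
    `callback(result)` from any thread. Raises TimeoutError if it never does.
    """
    done = threading.Event()
    box = {}

    def on_result(result):
        box["result"] = result
        done.set()  # wakes the waiting thread

    start_async(on_result, *args)
    if not done.wait(timeout):  # sleeps without a tight loop
        raise TimeoutError("no callback within %s seconds" % timeout)
    return box["result"]


def fake_get_data(callback, employee_id):
    # Stand-in for the library call: answers from a background thread.
    worker = threading.Thread(
        target=callback, args=({"id": employee_id, "name": "TG"},))
    worker.start()
```

Calling call_sync(fake_get_data, 100) returns the dict once the callback has run. Note that the calling thread stays occupied for the whole wait, so this only suits cases where holding a request thread that long is acceptable.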

Igor

Rol El

Dec 18, 2006, 5:05:20 PM
to TurboGears
Hey folks. Since there is some confusion about my library, let me put out
some code.

Let's say I have a libWebData.so, and I have Python bindings to it. The
way I init and use it is:

webDataInit(myCallback)
# This call goes into libWebData.so, which spawns a pthread and
# returns immediately
webDataGetEmployeeDetails(ID=100)

Then after some time, you get a callback. The library calls something
like:
myCallback(100, {'name': 'TG', 'age': '2'})


Now the problem is that if I make this sequence synchronous in my library,
I'll DoS myself in TG. The CherryPy pool only has 10 threads, and each
thread would be waiting on this data, which can take however long. So if 10
people connect at once and are waiting for the synchronous call to
finish, no one else can access the web server, even if they are doing
nothing related to the data. As you can see, increasing the thread pool
is also useless, because then it won't serve its actual purpose of
resource throttling, and you WILL get DoS'ed under a load of users
== thread_pool_size.

So, one solution could be to tell CherryPy to remove the current thread
from the thread pool until the callback is called, in which case
the callback does something to put the thread back into the thread
pool, or something like that. I have no clue whether there is a way to do
this. If there is, how do I implement a timeout for the case where the
callback never comes back?

Igor, your solutions 1 and 2 involve the threads waiting, and this
means that if I have 10 threads waiting, I'll DoS myself due to the
thread pool problem described above.

Diez, how does your callback class work? Let's say TG's index()
controller gets the request.

def index(kw, arg):
    eID = int(kw['employeeID'])
    webDataGetInit(diezCallback)
    webDataGetEmployeeDetails(ID=100)
    # some code to tell the browser to keep waiting? or maybe
    # some JS code to poll the server
    return <what?>

After 10 seconds the lib calls back with:
diezCallback('100', {'name':'TG'})

Now I store this in the session, as you said. What happens
now? How do I access that session from JS? Does that session stay
active forever, until the web browser makes the request for it? Or is
the session reply stored in a DB/file, and the callback returns?
Then later on, when JS checks, do we do a DB query and return the result?

What if the session never got invoked, or took so long that the
JS poll timed out? (Let's say we limit the number of times we poll so the
server isn't DoS'ed.)

Thanks for the help.

Diez B. Roggisch

Dec 18, 2006, 6:41:35 PM
to turbo...@googlegroups.com
> Diez, How does your callback class work? Lets say TG's index()
> controller gets the request.
>
> def index(kw, arg):
> eID = int(kw['employeeID'])
> webDataGetInit(diezCallback)
> webDataGetEmployeeDetails(ID=100)
> return <what?> # some code to tell browser to keep waiting? or may be
> some JS code to poll the server
>
> after 10 seconds lib calls back with:
> diezCallback('100'. {'name':'TG'})
>
> Now, i store this is stored into the session as you said. What happens
> now? how do i access that session from JS? Does that session stay
> active for ever until the web browser does the request for it? Or, is
> the session reply stored into a DB/file and the callback returns.. ?
> Then later on when JS checks, we do a DB query and return the result?

The session is a server-side object, usually in-memory. Any request
that hits the server with the proper session key (maintained as a cookie,
and sent along with e.g. MochiKit requests as well as normal
HTTP requests) can access it. On the server side!!!

So yes, you return a page that instructs the browser to poll; there
isn't any other possibility right now. How this is done, via HTML meta
tags (refresh) or JavaScript timer callbacks, depends on your
taste. I'd prefer the latter. So you need a second method, like this:

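@expose(format='json')
def poll_callback_data(self):
    session = cherrypy.session
    # same key the callback stores its results under
    if session.has_key('GetMeTheDataResults'):
        # 'data' is just an arbitrary key name for the payload
        return dict(success=True, data=session['GetMeTheDataResults'])
    return dict(success=False)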


See how the success value determines if the polling was successful.
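
The same store-then-poll idea can be sketched without any framework, which may make the flow easier to see. This is only an illustration: a module-level dict guarded by a lock stands in for the session, and store_result and poll are made-up names.

```python
import threading

_results = {}
_lock = threading.Lock()


def store_result(request_id, data):
    """Callback side: may run on any thread the library chooses."""
    with _lock:
        _results[request_id] = data


def poll(request_id):
    """Request side: returns immediately, whether or not data has arrived."""
    with _lock:
        if request_id in _results:
            # pop() so each result is handed out once and memory is freed
            return dict(success=True, data=_results.pop(request_id))
    return dict(success=False)
```

Neither function ever waits, so no request thread is held while the data is outstanding. The client simply asks again later if success is False.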


> What if the session never got invoked or took a very long time that the
> JS poll timed out.. (lets say we limit number of times we poll so
> server isn't DoS'ed?)

The session doesn't get "invoked"; the callback does. And my callback
object is just a function with some extra state. If it never gets
called, you lose the same thing you'd lose with any other callback
that is passed in and never called: a reference count (actually, you
have one too many).

Diez

Igor Foox

Dec 18, 2006, 9:23:44 PM
to turbo...@googlegroups.com
Hi Ron,

On 18-Dec-06, at 5:05 PM, Rol El wrote:

>
> Hey folks. Since there is some confusion on my library, let me put out
> some code.
>
> Lets say I have a libWebData.so, and i have python bindings to it. The
> way I init and use it is:
>
> webDataInit(myCallback)
> webDataGetEmployeeDetails(ID=100) # This call goes into libWebData.so
> which spawns a pthread and immediately returns
>
> Then after some time, you get a callback.. the library calls something
> like:
> myCallback(dataID=100, {'name':'TG', 'age':'2'})
>

OK, this makes sense.

>
> Now the problem is if i make this sequence synchronous in my library,
> i'll DoS myself in TG. The cherrypy pool only has 10 threads. So, each
> thread is waiting on this data which can take however long.. So, if 10
> people connect at once and are waiting for the synchronous call to
> finish, no one else can access the webserver even though they are doing
> nothing related to the data. As you can see, increasing the thread pool
> will also be useless, because it wont server the actual purpose of
> resource throttling then, and you WILL get DoS'ed under a load of users
> == thread_pool_size.
>
> So, one solution can be to order cherrypy to remove the current thread
> from the thread pool until the callback is called, and in which case
> the callback does something in it to put itself back into the thread
> pool or something. I have no clue if there is a way for this.. If there
> is, how do i implement a timeout in the case where the callback didn't
> come back forever?
>
> Igor, your solution 1, and 2 involve the threads waiting, and this
> means if i have 10 threads waiting, i'll DoS myself due to the above
> described thread_pool problem.

Yes, but the question is: what are you trying to achieve? As far as I
can see you have two options:
- Make the asynchronous call synchronous, so that the controller only
  returns after myCallback is run. The user's browser waits for the
  response all this time. You inevitably DoS yourself, as you say.

- Keep the call asynchronous: the controller method returns right away,
  so the user gets a template telling him that the request has been
  submitted, along with some sort of JS status indicator. The status
  indicator polls at a regular interval (say 10 seconds) to check whether
  the request has returned. You don't DoS yourself.

Based on what you said above, I think the second option makes more
sense in your case.
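
The polling side would normally be JavaScript in the browser, but the logic is simple enough to sketch in Python. poll_until_ready is a made-up name, and the defaults (10 seconds, 6 attempts) are only example values. Capping the attempts keeps a lost callback from being polled forever.

```python
import time


def poll_until_ready(fetch, interval=10.0, max_attempts=6, sleep=time.sleep):
    """Call `fetch()` until it reports success or attempts run out.

    `fetch` returns a dict with a boolean "success" key, like the polling
    endpoint discussed in this thread. Returns the successful reply, or None.
    """
    for attempt in range(max_attempts):
        reply = fetch()
        if reply.get("success"):
            return reply
        if attempt < max_attempts - 1:
            sleep(interval)  # wait between polls, not after the last one
    return None
```

A return value of None means the client gave up, at which point the page can show an error or offer a retry instead of waiting indefinitely.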

Igor

