Proper way to stop a Tornado

Christopher Smith

Mar 1, 2011, 12:41:24 PM
to python-tornado
I'm working on instrumenting a clean shutdown for a Tornado service.
It isn't nearly as simple as I'd first thought. It seems like what I
need to do is to stop accepting new connections while still leaving
the event loop running to process pending requests followed by an exit
when all requests cease to be processed. Is there some obvious way to
do this that doesn't involve patching the Tornado framework that I'm
just too much of a dunderhead to see?

--
Chris

Phil Plante

Mar 1, 2011, 1:30:46 PM
to python-...@googlegroups.com
It would be nice if we could send a SIGHUP to it and have the loop stop accepting new connections.  I use supervisord for process management so it would be trivial to send the proper signal.

Ben Darnell

Mar 1, 2011, 1:43:36 PM
to python-...@googlegroups.com, Christopher Smith
HTTPServer.stop() will make it stop accepting new connections, and then IOLoop.stop() (after a delay) will shut the whole thing down.  

-Ben

Cliff Wells

Mar 1, 2011, 3:29:47 PM
to python-...@googlegroups.com
Like so?

def main ():
    def sighup_handler (server, loop, signum, frame):
        server.stop ()
        loop.stop ()

    server = tornado.httpserver.HTTPServer (...)
    loop = tornado.ioloop.IOLoop.instance ()

    signal.signal (signal.SIGHUP,
                   lambda s, f: sighup_handler (server, loop, s, f))


Seems to work here, but I'm wondering what you meant by "after a delay".
Will it happen after a delay on its own, or do I need to introduce one?

Cliff

David Birdsong

Mar 1, 2011, 3:32:27 PM
to python-...@googlegroups.com, Ben Darnell, Christopher Smith
I wrote up a graceful reload for a tornado app I maintained at a prior job:
http://groups.google.com/group/python-tornado/browse_thread/thread/1502f320fa33dbc0/a16ba1fbab015ac1?lnk=gst&q=birdsong#a16ba1fbab015ac1

It worked well enough since it did most of the necessary unbind, fork,
and exit. With nginx in front of it, existing and new connections were
unaffected.

Ben Darnell

Mar 1, 2011, 3:39:11 PM
to python-...@googlegroups.com, Cliff Wells
You need some sort of delay to allow the requests in progress to finish.  Also, signal handlers introduce the same sort of issues as threads, so you need to use add_callback to make things safe.

def sighup_handler(server, loop, signum, frame):
  loop.add_callback(functools.partial(stop_server, server, loop))

def stop_server(server, loop):
  server.stop()
  loop.add_timeout(time.time() + 5.0, functools.partial(stop_loop, loop))

def stop_loop(loop):
  loop.stop()

-Ben
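
The three functions above can also be folded into a single factory. This is only a minimal sketch, not code from the thread: `make_shutdown_handler` and `grace_seconds` are hypothetical names, and `server` and `loop` are assumed to expose just the `stop()`, `add_callback()` and `add_timeout()` methods already used above.

```python
import time


def make_shutdown_handler(server, loop, grace_seconds=5.0):
    """Return a signal handler that stops `server`, then `loop` after a delay.

    Hypothetical helper: `server` needs stop(); `loop` needs add_callback(),
    add_timeout() and stop(), as used in the messages above.
    """

    def stop_server():
        # Runs on the loop: stop accepting new connections first ...
        server.stop()
        # ... then give in-flight requests `grace_seconds` to finish.
        loop.add_timeout(time.time() + grace_seconds, loop.stop)

    def handler(signum, frame):
        # Do no real work inside the signal handler; hand off to the loop.
        loop.add_callback(stop_server)

    return handler
```

Registering it would look like `signal.signal(signal.SIGHUP, make_shutdown_handler(server, loop))`, with `server` and `loop` created as in the earlier messages. The fixed grace period is still an arbitrary constant.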

Christopher Smith

Mar 1, 2011, 3:55:29 PM
to python-...@googlegroups.com, python-...@googlegroups.com, Cliff Wells
I would think the right thing would be for the workers to exit on their own once all events have been processed.

--Chris

Cliff Wells

Mar 2, 2011, 4:51:24 AM
to python-...@googlegroups.com
So I think what is needed is an additional method IOLoop.has_callbacks()
that can be used to see if there are pending callbacks.

I'm actually using this in an unrelated, but similar, situation where I
run a bunch of database-related callbacks in an IOLoop (part of an
install procedure) and I want to exit when they are all finished. What
I'm using right now is this:

def shutdown ():
    if not ioloop._callbacks:
        ioloop.stop ()
        raise SystemExit

PeriodicCallback (shutdown, 1000, ioloop).start ()

This seems to work nicely, with the notable exception of relying on an
implementation detail to get the job done. Having the aforementioned
method would solve this.

Regards,
Cliff

Cliff Wells

Mar 2, 2011, 5:03:14 AM
to python-...@googlegroups.com
An alternative method might be to have an attribute like
IOLoop.idle_callback that is called once whenever IOLoop._callbacks
becomes empty. This would allow for one-shot IOLoops (process a bunch
of callbacks then exit) as well as knowing when to exit cleanly as was
the original topic in this thread.

Regards,
Cliff

Cliff Wells

Mar 2, 2011, 5:41:47 AM
to python-...@googlegroups.com
Bah, actually I just realized that my reliance on an implementation
detail is unreliable out of the gate due to this little bit of code from
IOLoop.start:

    callbacks = self._callbacks
    self._callbacks = []
    for callback in callbacks:
        self._run_callback(callback)

Back to the drawing board :P I was trying to get away with not
modifying Tornado itself, but that's appearing like a fairly remote
possibility at this point.

Cliff
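
The effect of that swap can be reproduced with a toy loop. `ToyLoop` below is a hypothetical stand-in that copies only the quoted swap-then-run shape; it is not Tornado's IOLoop. A check that runs as one callback in a batch sees an empty `_callbacks` list even though another callback in the same batch has not run yet.

```python
class ToyLoop:
    """Toy stand-in for the callback handling quoted above (not Tornado)."""

    def __init__(self):
        self._callbacks = []

    def add_callback(self, callback):
        self._callbacks.append(callback)

    def run_once(self):
        # Same shape as the quoted snippet: swap the list, then run the batch.
        callbacks = self._callbacks
        self._callbacks = []
        for callback in callbacks:
            callback()


loop = ToyLoop()
ran = []        # names of callbacks that have already run
observed = []   # what the "is the loop idle?" check saw


def idle_check():
    # Record the visible queue length and which callbacks have run so far.
    observed.append((len(loop._callbacks), list(ran)))


loop.add_callback(idle_check)
loop.add_callback(lambda: ran.append("work"))
loop.run_once()

# observed is [(0, [])]: the queue looked empty while "work" was still pending.
```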


Christopher Smith

Mar 2, 2011, 9:20:44 AM
to python-...@googlegroups.com, Cliff Wells
Well, now you are at the point that I was led to. If nothing else,
draw comfort that I reached the same conclusion myself.

--Chris

Ben Darnell

Mar 2, 2011, 1:29:11 PM
to python-...@googlegroups.com, Cliff Wells
On Wed, Mar 2, 2011 at 1:51 AM, Cliff Wells <cl...@develix.com> wrote:
> So I think what is needed is an additional method IOLoop.has_callbacks()
> that can be used to see if there are pending callbacks.

Why limit yourself to just callbacks?  Handlers and timeouts also mean that the IOLoop has future work pending.  They are also why trying to exit when the IOLoop is "idle" is probably not the right thing to do.  It's possible to ignore the incidental implementation details (like the IOLoop's own waker pipe), and add a daemon flag to timeouts (like the daemon flag on threads, so that things like autoreload won't keep the IOLoop perpetually busy), but it gets kind of messy.

I think maybe what's needed here is not an idle callback but a conditional callback.  The condition would be evaluated each time through the IOLoop without touching the waker pipe, so it's not just busy-waiting; that is why it would require IOLoop modifications rather than just using the current add_callback.  When the condition returns true, the real callback is called and the conditional callback is removed.

A conditional callback would be more expensive than other ways of solving the problem since the condition would be evaluated frequently, but at least for the clean-shutdown case it feels cleaner to attach the logic to the IOLoop like this than to have each handler call a method when it finishes that will trigger a shutdown if necessary (although you'd still need to decrement an active counter or something, so it's not completely non-invasive).  

-Ben
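
The handler-driven alternative mentioned above (each handler reports when it finishes, with an active counter) might look roughly like this. It is a sketch under assumptions: `InFlightTracker` and its method names are hypothetical, and nothing here is part of Tornado.

```python
class InFlightTracker:
    """Counts active requests; fires `on_drained` once draining has begun
    and the count has dropped to zero. Hypothetical helper, not Tornado API."""

    def __init__(self, on_drained):
        self._on_drained = on_drained
        self._active = 0
        self._draining = False
        self._fired = False

    def request_started(self):
        self._active += 1

    def request_finished(self):
        self._active -= 1
        self._maybe_finish()

    def begin_drain(self):
        # Called after the server has stopped accepting new connections.
        self._draining = True
        self._maybe_finish()

    def _maybe_finish(self):
        # Fire at most once, and only when nothing is left in flight.
        if self._draining and self._active == 0 and not self._fired:
            self._fired = True
            self._on_drained()
```

One possible wiring, again an assumption, is to call `request_started()` and `request_finished()` from each handler, call `begin_drain()` right after `HTTPServer.stop()`, and pass `loop.stop` as `on_drained`. It covers only requests that report in, so timeouts and other pending IOLoop work are not counted.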

Cliff Wells

Mar 2, 2011, 4:35:05 PM
to python-...@googlegroups.com

Yeah I'm not sure it's worth it. For the clean shutdown case, waiting
an arbitrary amount of time is probably acceptable, if not pretty. For
the case I was just dealing with (a one-shot IOLoop) I ended up
refactoring enough during the learning process that I no longer need
that process to be a one-shot deal (and hey, it's better), so the issue
is moot there.

As much as I dislike arbitrary constants, I don't think this rarely used
case justifies adding any significant overhead, unless the feature can
be leveraged to cover more interesting and frequently-used problems.
I'm currently not thinking of any.

Regards,
Cliff


Liu Guangtao

Nov 7, 2014, 7:55:09 AM
to python-...@googlegroups.com
Here is my solution, which draws on the information all of you have supplied.
Note:
       I use PeriodicCallback to check memory usage; when usage exceeds the configured limit, the callback signals the application, which then gracefully restarts itself.

def start_httpserver(app_root, handler_map, opts):
    # create app and http server
    app = Application(handler_map, app_root, opts)
    http_server = tornado.httpserver.HTTPServer(app, xheaders=True)
    ioloop = tornado.ioloop.IOLoop.instance()
    
    def mem_check_scheduler():
        easylog.debug("Call Check MEM")
        rss_mem = systool.mem('rss') / 1024
        if rss_mem > app.sandbox_conf['restart_mem']:
            easylog.error("RSS: %d MB Need to restart", rss_mem)
            os.kill(os.getpid(), signal.SIGUSR1)
            
    mem_check_scheduler = tornado.ioloop.PeriodicCallback(mem_check_scheduler,
                                                          1000 * app.sandbox_conf['mem_check_sec'])
    mem_check_scheduler.start()
    
    def restart_self():
        easylog.error("Stopping the server")
        http_server.stop()  # stop accepting new requests
        mem_check_scheduler.stop()
        
        deadline = time.time() + 10
        def stop_loop():
            now = time.time()
            easylog.error("wait for ioloop finish. %d", now)
            if now < deadline and ioloop._callbacks:
                ioloop.add_timeout(now+1, stop_loop)
            else:
                ioloop.stop()
                easylog.error("IOLOOP stopped")
                #restart code comes from autoreload.py in tornado
                try:
                    os.execv(sys.executable, [sys.executable] + sys.argv)
                except OSError:
                    # Mac OS X versions prior to 10.6 do not support execv in
                    # a process that contains multiple threads.  Instead of
                    # re-executing in the current process, start a new one
                    # and cause the current process to exit.  This isn't
                    # ideal since the new process is detached from the parent
                    # terminal and thus cannot easily be killed with ctrl-C,
                    # but it's better than not being able to autoreload at
                    # all.
                    # Unfortunately the errno returned in this case does not
                    # appear to be consistent, so we can't easily check for
                    # this error specifically.
                    os.spawnv(os.P_NOWAIT, sys.executable,
                              [sys.executable] + sys.argv)
                    sys.exit(0)
        stop_loop()
        
    def shutdown():
        easylog.error("Stopping the server")
        http_server.stop()  # stop accepting new requests
        mem_check_scheduler.stop()
        deadline = time.time() + 10
        def stop_loop():
            now = time.time()
            easylog.error("wait for ioloop finish. %d", now)
            if now < deadline and ioloop._callbacks:
                ioloop.add_timeout(now+1, stop_loop)
            else:
                ioloop.stop()
                easylog.error("IOLOOP stopped")
                sys.exit(1)
        stop_loop()
        
    
    def exit_handler(sig, frame):
        easylog.error("#### Caught signal:%d will exit ####", sig)
        ioloop.spawn_callback(shutdown)
    
    signal.signal(signal.SIGTERM, exit_handler)
    signal.signal(signal.SIGINT, exit_handler)
    signal.signal(signal.SIGABRT, exit_handler)
    
    def restart_handler(sig, frame):
        easylog.error("#### Caught signal:%d will restart self ####", sig)
        ioloop.spawn_callback(restart_self)
        
    signal.signal(signal.SIGPIPE, restart_handler)
    signal.signal(signal.SIGUSR1, restart_handler)
    
    # start http server
    if opts.debug == True:
        http_server.listen(opts.port, address=opts.address)
    else:
        http_server.bind(opts.port, address=opts.address)
        http_server.start(num_processes=opts.process)
    easylog.info('listen @ %s:%d' % (opts.address, opts.port))
    
    ioloop.start()
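
The `stop_loop` logic appears twice above, once in `restart_self` and once in `shutdown`. It could be factored into one helper, sketched here under assumptions: `stop_when_drained`, `is_busy`, `on_stopped`, `max_wait`, `poll` and `clock` are hypothetical names, and `loop` only needs the `add_timeout()` and `stop()` methods used above.

```python
import time


def stop_when_drained(loop, is_busy, on_stopped,
                      max_wait=10.0, poll=1.0, clock=time.time):
    """Poll `is_busy()` every `poll` seconds; stop `loop` once it returns
    False or `max_wait` seconds have passed, then call `on_stopped()`.

    Hypothetical helper mirroring the duplicated stop_loop() closures above.
    """
    deadline = clock() + max_wait

    def check():
        now = clock()
        if now < deadline and is_busy():
            # Still busy and still within the deadline: look again later.
            loop.add_timeout(now + poll, check)
        else:
            loop.stop()
            on_stopped()

    check()
```

With this, `shutdown` would reduce to stopping the server and the memory check, then calling `stop_when_drained(ioloop, lambda: bool(ioloop._callbacks), lambda: sys.exit(1))`, and `restart_self` would pass its re-exec code as `on_stopped`. As noted earlier in the thread, `ioloop._callbacks` is an implementation detail and not a reliable busy signal, so the deadline is what really bounds the wait.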
  



On Wednesday, March 2, 2011 at 1:41:24 AM UTC+8, Christopher Smith wrote: