Why is Tornado with Motor much slower than Tornado with PyMongo?


过云峰

Dec 23, 2015, 3:53:04 AM
to Tornado Web Server
I made two branches of my Tornado app: a sync version using PyMongo and an async version using Motor. I then benchmarked both with ab (ApacheBench) and found that the async version performed much worse than the sync version.

* The site is deployed on my local machine and only uses 1 process.

- This is the 'Sync' version: [attached ab output not preserved]
- This is the 'Async' version: [attached ab output not preserved]


QPS: 14.5 (Sync) > 4.13 (Async)
Time per request: 68.984 ms (Sync) < 242.28 ms (Async)

This is the 'Sync' code:
class IndexHandler(BaseHandler):

    # http://ip:port/
    def get(self):
        collection = 'show'
        start_ts = helpers.today_start_timestamp()
        end_ts = helpers.today_end_timestamp()
        condition = {'push_time': {'$gte': start_ts, '$lt': end_ts}}
        statistic = {
            'total_show_number': self.db.show.count(),
            'total_video_number': self.db.video.count(),
            'total_push_show_number': self.db.show.find({'state': '1'}).count(),
            'total_push_video_number': self.db.video.find({'state': '1'}).count(),
            'today_show_number': self.db.show.find({'push_time': {'$gte': start_ts, '$lt': end_ts}}).count(),
            'today_video_number': self.db.video.find({'push_time': {'$gte': start_ts, '$lt': end_ts}}).count(),
            'today_push_show_number': self.db.show.find({'push_time': {'$gte': start_ts, '$lt': end_ts},
                                                         'state': '1'}).count(),
            'today_push_video_number': self.db.video.find({'push_time': {'$gte': start_ts, '$lt': end_ts},
                                                           'state': '1'}).count(),
        }
        return self.paginate_render("index.html",
                                    collection=collection,
                                    condition=condition,
                                    sort_key="push_time",
                                    sort_direction=pymongo.DESCENDING,
                                    statistic=statistic)

This is the 'Async' code:
class IndexHandler(BaseHandler):

    PAGE_SIZE = 20

    # http://ip:port/
    @coroutine
    def get(self):
        start_ts = helpers.today_start_timestamp()
        end_ts = helpers.today_end_timestamp()
        shows = self.db.show.find({'push_time': {'$gte': start_ts, '$lt': end_ts}}).sort("push_time", -1)
        show_info_list = yield helpers.format_show(shows)
        # Each yield below waits for one round trip before starting the next.
        total_show_number = yield self.db.show.count()
        total_video_number = yield self.db.video.count()
        total_push_show_number = yield self.db.show.find({'state': '1'}).count()
        total_push_video_number = yield self.db.video.find({'state': '1'}).count()
        today_show_number = yield self.db.show.find({'push_time': {'$gte': start_ts, '$lt': end_ts}}).count()
        today_video_number = yield self.db.video.find({'push_time': {'$gte': start_ts, '$lt': end_ts}}).count()
        today_push_show_number = yield self.db.show.find({'push_time': {'$gte': start_ts, '$lt': end_ts},
                                                          'state': '1'}).count()
        today_push_video_number = yield self.db.video.find({'push_time': {'$gte': start_ts, '$lt': end_ts},
                                                            'state': '1'}).count()
        statistic = {
            'total_show_number': total_show_number,
            'total_video_number': total_video_number,
            'total_push_show_number': total_push_show_number,
            'total_push_video_number': total_push_video_number,
            'today_show_number': today_show_number,
            'today_video_number': today_video_number,
            'today_push_show_number': today_push_show_number,
            'today_push_video_number': today_push_video_number,
        }
        page_size = self.PAGE_SIZE
        page_current = self.get_arg("pg", 1, int)
        item_total = today_show_number
        # Floor division keeps page_total an integer on Python 2 and 3.
        if item_total % self.PAGE_SIZE:
            page_total = item_total // self.PAGE_SIZE + 1
        else:
            page_total = item_total // self.PAGE_SIZE
        self.render("index.html", shows=show_info_list, statistic=statistic, page_size=page_size,
                    page_current=page_current, page_total=page_total)

A. Jesse Jiryu Davis

Dec 23, 2015, 10:15:59 AM
to python-...@googlegroups.com
It's certainly true that Motor is slower than PyMongo. Async code is slower in most programming languages, because the I/O code path is much more complex, and that complexity is particularly costly in Python because executing Python code is so slow. For a dense and worthwhile read see SQLAlchemy author Mike Bayer's article:


My response:


Motor is useful in many circumstances, but it will always be much slower than PyMongo on a benchmark like yours. You do a large number of very fast operations on a local MongoDB, which is the worst case for Motor: your code spends all its time executing logic in Tornado's event loop, and almost no time waiting for MongoDB, or the network, or executing Motor's own code. Using async for this benchmark is pure overhead.
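To make the "pure overhead" point concrete, here is a minimal sketch, assuming nothing about Motor or Tornado internals. It uses only the standard library's asyncio and compares many trivially fast calls made directly against the same calls made through a coroutine that yields to the event loop once per call. The names `fast_lookup`, `fast_lookup_async`, and `compare` are hypothetical and exist only for this illustration.

```python
import asyncio
import time


def fast_lookup(x):
    # Stand-in for a very fast operation (for example, a local query).
    return x + 1


async def fast_lookup_async(x):
    # Same work, but it yields to the event loop once, as an async
    # driver call would even when the answer is ready almost immediately.
    await asyncio.sleep(0)
    return x + 1


def run_sync(n):
    total = 0
    for i in range(n):
        total += fast_lookup(i)
    return total


async def run_async(n):
    total = 0
    for i in range(n):
        total += await fast_lookup_async(i)
    return total


def compare(n=20000):
    """Return (sync_total, async_total, sync_seconds, async_seconds)."""
    start = time.perf_counter()
    sync_total = run_sync(n)
    sync_seconds = time.perf_counter() - start

    start = time.perf_counter()
    async_total = asyncio.run(run_async(n))
    async_seconds = time.perf_counter() - start

    return sync_total, async_total, sync_seconds, async_seconds
```

Calling `compare()` and printing the two timings shows how much of the async run is event-loop bookkeeping rather than useful work; the exact ratio depends on the machine and the Python version.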

Python 3.5's "async" / "await" will help a lot. Or, if you care more about throughput than about non-blocking features, use PyMongo and a multithreaded web server instead of Motor and Tornado.
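For readers who have not seen the Python 3.5 syntax, the following is a rough sketch of what native coroutines look like. It runs on the standard library's asyncio so that it is self-contained; `StubCollection` and `collect_statistics` are hypothetical names for this illustration and are not Motor's or Tornado's API.

```python
import asyncio


class StubCollection:
    """Hypothetical stand-in for a database collection (not Motor's API)."""

    def __init__(self, size):
        self._size = size

    async def count(self):
        # Yield to the event loop once, the way a real network call would.
        await asyncio.sleep(0)
        return self._size


async def collect_statistics(show, video):
    # Native coroutines use "await" where the handler above uses "yield".
    total_show_number = await show.count()
    total_video_number = await video.count()
    return {
        'total_show_number': total_show_number,
        'total_video_number': total_video_number,
    }


def demo():
    # Drive the coroutine to completion on a fresh event loop.
    return asyncio.run(collect_statistics(StubCollection(3), StubCollection(5)))
```

The structure mirrors the handler in the original post: only the `def`/`yield` keywords change to `async def`/`await`, and no decorator is needed.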


Zhang Alex

Jan 7, 2016, 11:19:25 PM
to Tornado Web Server
Or Motor on PyPy could give better performance.


On Wednesday, December 23, 2015 at 4:53:04 PM UTC+8, 过云峰 wrote:

Steve Owens

Jan 31, 2016, 2:12:08 PM
to Tornado Web Server
Good to know. Suppose you have a heterogeneous service. In some cases you are using MongoDB to do quick updates, and in others you want to stream back large numbers of records from MongoDB or from files. Would Tornado be a good option for that use case? According to the introductory material, Tornado is supposed to be faster than multithreading, but under what circumstances is this the case?

Ben Darnell

Jan 31, 2016, 5:17:04 PM
to Tornado Mailing List
On Sun, Jan 31, 2016 at 2:12 PM, Steve Owens <steve...@gmail.com> wrote:
> Good to know. Suppose you have a heterogeneous service. In some cases you are using MongoDB to do quick updates, and in others you want to stream back large numbers of records from MongoDB or from files. Would Tornado be a good option for that use case?

Tornado is well suited to streaming large result sets with a minimal memory footprint. The "quick updates" part doesn't exactly play to Tornado's strengths, but that doesn't make it a bad option. Not every decision needs to be made on the basis of performance. After all, if performance is the top priority, there is more to be gained by using a language other than Python than by your choice of Python framework.
 
> According to the introductory material, Tornado is supposed to be faster than multithreading, but under what circumstances is this the case?

Asynchronous architectures like Tornado are more *scalable* than threading, but performance is more complicated. I wouldn't expect Tornado to outperform a good multithreaded system on a throughput basis except in cases where the pathological behavior described in David Beazley's presentations (http://www.dabeaz.com/GIL/) comes into play (Python 2 has the "old GIL" and Python 3 has the "new GIL"). Tornado's scalability shines when you have lots of connections, either because connections are idle waiting on a human (long polling/websockets) or because you are limited by network resources (streaming large results).
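The "lots of idle connections" case can be sketched with the standard library's asyncio, which follows the same single-threaded event-loop model. This is only an illustration under stated assumptions: each "connection" is simulated by a sleep rather than a real socket, and `idle_connection`, `serve_idle_connections`, and `demo` are hypothetical names.

```python
import asyncio
import time


async def idle_connection(conn_id, wait_seconds):
    # Simulates a connection that is mostly waiting (for example, long polling).
    await asyncio.sleep(wait_seconds)
    return conn_id


async def serve_idle_connections(count, wait_seconds):
    # All waits overlap on one thread; no thread per connection is needed.
    waits = [idle_connection(i, wait_seconds) for i in range(count)]
    return await asyncio.gather(*waits)


def demo(count=1000, wait_seconds=0.05):
    """Return (results, elapsed_seconds) for `count` simulated connections."""
    start = time.perf_counter()
    results = asyncio.run(serve_idle_connections(count, wait_seconds))
    elapsed = time.perf_counter() - start
    return results, elapsed
```

Because the waits overlap, the elapsed time should be far below `count * wait_seconds`, which is the sense in which an event loop scales with connection count even though each individual operation is no faster.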

-Ben