import pymongo
import threading
import time

conn = pymongo.Connection('localhost', 27017)
db = conn.myproj

def xx(i):
    a = time.time()
    d = list(db.app_login_logout_log.find())
    print 'time:', time.time() - a, 'count:', len(d)

for i in range(1):
    time.sleep(0.1)
    t = threading.Thread(target=xx, args=(i,))
    t.start()

time.sleep(3)
print 'Three thread each time spent at the same time:'
for i in range(3):
    time.sleep(0.1)
    t = threading.Thread(target=xx, args=(i,))
    t.start()

out:
time: 1.94099998474 count: 479301
Three thread each time spent at the same time:
time: 6.62199997902 count: 479301
time: 6.79799985886 count: 479301
time: 6.9240000248 count: 479301

When the query runs in several threads at once, each thread takes much longer than a single thread does. Is there a way to optimize this code?

Hello,
The threaded performance you are seeing is due to Python’s Global Interpreter Lock (GIL), which allows only one thread at a time to execute Python bytecode.
In your code, the “computation” is the line
d = list(db.app_login_logout_log.find())
where the GIL is held by the thread that is converting BSON into Python data structures.
The GIL is not held for IO, but since the mongod you’re connected to is on localhost, comparatively little time is spent doing IO.
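The effect can be reproduced without MongoDB. The following is only a rough sketch, written in Python 3 syntax: `cpu_bound` is a hypothetical pure-Python loop standing in for the BSON decoding work, and `run_in_threads` is a helper name made up for this illustration. On a standard CPython build with the GIL, one would expect the three-thread run to take roughly three times as long as the single-thread run, since the threads take turns rather than running in parallel.

```python
import threading
import time


def cpu_bound(n):
    """Pure-Python loop standing in for BSON decoding (hypothetical workload)."""
    total = 0
    for k in range(n):
        total += k * k
    return total


def run_in_threads(num_threads, n):
    """Run cpu_bound(n) in num_threads threads; return (elapsed seconds, results)."""
    results = [None] * num_threads

    def worker(slot):
        # Each thread writes to its own slot, so no extra locking is needed.
        results[slot] = cpu_bound(n)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start, results


if __name__ == "__main__":
    # Compare one thread against three threads doing the same work each.
    for count in (1, 3):
        elapsed, _ = run_in_threads(count, 2000000)
        print("%d thread(s): %.2fs" % (count, elapsed))
```

The exact timings depend on the machine, so none are shown here.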
Using the multiprocessing module allows the program to scale up a bit better, because each process has its own interpreter and its own GIL. For example, the code can be modified slightly as follows:
import pymongo
import sys
import time
from multiprocessing import Process

def xx(i):
    # Create the client inside the child process so each process has its own connection.
    conn = pymongo.MongoClient('localhost', 27017)
    db = conn.test
    print i, 'started'
    a = time.time()
    d = list(db.test.find().limit(100000))
    print i, 'finished. time:', time.time() - a

if __name__ == '__main__':
    # The number of worker processes is taken from the command line.
    procs = [Process(target=xx, args=(i,)) for i in range(int(sys.argv[1]))]
    start = time.time()
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print 'all done: %.2f' % (time.time() - start)

The output shows better scaling vs. using threads:
$ python foo.py 1
0 started
0 finished. time: 0.338715076447
all done: 0.35
$ python foo.py 10
0 started
1 started
2 started
3 started
4 started
5 started
6 started
7 started
8 started
9 started
4 finished. time: 1.09985303879
5 finished. time: 1.10895490646
0 finished. time: 1.11398696899
8 finished. time: 1.12296009064
3 finished. time: 1.13394904137
6 finished. time: 1.13271999359
2 finished. time: 1.13802504539
9 finished. time: 1.13268995285
7 finished. time: 1.13850402832
1 finished. time: 1.15410399437
all done: 1.17
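
The script above only prints timings inside each child process. If the parent needs the fetched data back, one option is a process pool. This is a minimal sketch in Python 3 syntax, not a drop-in solution: `fetch_chunk` and `fetch_all` are hypothetical names, and `fetch_chunk` builds plain dicts instead of querying MongoDB so that the sketch runs without a server. In real use each worker would open its own MongoClient, as in the script above.

```python
import time
from multiprocessing import Pool


def fetch_chunk(n):
    """Hypothetical stand-in for one worker's query.

    A real worker would create its own MongoClient here and return the
    documents it read; this version just builds n small dicts.
    """
    return [{"_id": k, "value": k * k} for k in range(n)]


def fetch_all(num_procs, n):
    """Run fetch_chunk(n) in num_procs worker processes and gather the results."""
    with Pool(processes=num_procs) as pool:
        # map() blocks until every worker has returned its list.
        return pool.map(fetch_chunk, [n] * num_procs)


if __name__ == "__main__":
    start = time.time()
    chunks = fetch_all(4, 100000)
    print("got %d chunks in %.2fs" % (len(chunks), time.time() - start))
```

Results returned this way are pickled in the worker and unpickled in the parent, which adds its own cost, so it may be cheaper to do as much processing as possible inside the workers and return only a summary.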
For more information, please see “Thread State and the Global Interpreter Lock” in the Python/C API reference.
Best regards,
Kevin