Python does support threading (albeit not all *that* thoroughly, but I
think it handles file i/o ok). It's not too tough to put together a
simple work queue using e.g. the queue module:
http://docs.python.org/library/queue.html
If that doesn't work for you, providing some details as to why might
lead the way to a more helpful answer.
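For instance, something along these lines (Python 3 spelling of the module; the paths and the "training" step are just stand-ins):

```python
import queue
import threading

def loader(paths, q):
    # Producer: stands in for reading large examples from disk.
    for p in paths:
        example = "data:" + p      # pretend this is expensive file I/O
        q.put(example)
    q.put(None)                    # sentinel marks end of stream

q = queue.Queue(maxsize=8)         # bounded, so loading can't run far ahead
paths = ["a.pkl", "b.pkl", "c.pkl"]
t = threading.Thread(target=loader, args=(paths, q))
t.start()

results = []
while True:
    item = q.get()
    if item is None:
        break
    results.append(item)           # stand-in for the SGD step
t.join()
```

The bounded queue keeps the loader from reading arbitrarily far ahead of the consumer, which caps memory use.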
-josh
> Is there any way to set up a "producer thread, consumer thread" scenario for
> use with theano? Most of my work involves datasets where the individual
> examples are very large, and a significant amount of the time it takes to
> process them is just file I/O. It would be nice if I could hide some of that
> latency by having one thread load examples while another thread does
> stochastic gradient descent on the loaded examples and then discards them.
> It seems like I'm out of luck because there is no such thing as threading in
> python. Is there any way of working around that restriction?
What exactly do you mean by block I/O? Do you mean you think a call to a C function in one thread might prevent an I/O operation from taking place in another thread?
From what I've read in the last few minutes, it sounds like while threads are supported, every subexpression that touches a python object needs to get the global lock. This lock can be dropped during some I/O operations, but presumably initiating each I/O operation requires touching the python object that you want the result written to. This means you would not be able to start any new I/O operations while your theano function is running (the C code for theano ops definitely touches python objects without acquiring any lock so I'm assuming we must hold the lock throughout the entire execution of the theano function). Do you think that could be what prevented you from getting performance improvements, Dumitru?
On Fri, Oct 15, 2010 at 5:21 PM, Dumitru Erhan <dumitr...@gmail.com> wrote:
Do calls to C functions in Python block I/O (I had read that somewhere, but I might have gotten this wrong)? I tried doing what Ian wants (with this particular module), but I've never had any luck with actually gaining performance. Never investigated too much, though.

Dumitru
Hmmm...bummer.
Another option to look into is the multiprocessing module:
http://docs.python.org/library/multiprocessing.html -- basically
multithreading but via processes, thus avoiding the GIL. It looks like
it might offer a decent alternative, as long as the IPC doesn't prove
to be too slow and/or the shared memory facilities not helpful.
-josh
There was an example in Brian Granger's HPC Tutorial at Scipy'10. If
you can't find it, let me know.
HTH
N
--
Nicolas Pinto
Ph.D. Candidate, Brain & Computer Sciences
Massachusetts Institute of Technology, USA
http://web.mit.edu/pinto
Yes, Stackless is compatible. However, I don't think it lifts the
GIL restriction, so there's no performance improvement there.
Your other option would be to code an extension module that releases
the GIL and does I/O in the background. It shouldn't be too hard.
--
The SnW brigade wants to recruit you - http://www.brigadesnw.com
> I'm loading them by passing the file object to cPickle.
I know that this is a really old thread, and you may have already
found a good solution, but just in case, I thought I'd share something
I just discovered: Python's gzip file wrapper *really* slows things
down.
Of course, without the gzip wrapper, files are much larger. However,
courtesy of a recent email from David Warde-Farley about carray, I've
been playing with another alternative, blosc
(https://github.com/FrancescAlted/python-blosc).
I've been experimenting with data compression options on the mnist
pickle file used with the tutorials. I've tried three things so far:
(1) Use the gzipped pickle file exactly as it comes, with gzip.open.
(2) gunzip the pickle file, and just use plain open().
(3) Use blosc.pack_array on each numpy array in the file, and then
pickle the results. When loading the file, use blosc.unpack_array to
restore each numpy array.
Summary:
Method   File size   Total file load time (including all decompression)
gzip     220.0 Mb    6.76s
open      16.2 Mb    0.52s
blosc     26.4 Mb    0.87s
So it looks like blosc might actually offer a nice middle ground, in
terms of keeping file sizes small while still offering fast read times.
Of course, if you're not currently using the gzip wrapper, this
doesn't help much...
I found using blosc to be quite straightforward, but I can share the
crude code I cobbled together to test this, if it would be of any use.
-josh
I did indeed! I transposed the file sizes for gzip/open. Good catch.
Fixed version:
Method   File size   Total file load time (including all decompression)
open     220.0 Mb    0.52s
gzip      16.2 Mb    6.76s
blosc     26.4 Mb    0.87s
It looks like future versions of carray will avoid the need for
manually managing blosc compression and will make it a bit easier --
see http://groups.google.com/group/carray/browse_thread/thread/1aadced6eefb359.
When 0.4 comes out, I plan to revisit. Adding theano support for
carrays (if only via triggering automatic exporting of slices to numpy
arrays) could make this almost entirely transparent.
-josh
(This is in keeping with the straightforward open case -- 0.52s to
read 220Mb matches pretty closely to 0.06s to read 26.4Mb.)
-josh
On Mon, Jan 3, 2011 at 4:00 PM, Josh Bleecher Snyder wrote:
Interesting. Did you try the transpose trick with gzip too, or only
with carray? It could help me make the file smaller and the
decompression time shorter too.
Don't forget that the compression ratio of carray vs. gzip will
change. carray is specialized for data with low entropy, unlike
gzip. So you must do the file size comparison for each dataset that
you will use. Also, for use on a cluster with only one file server
serving ~350 jobs running at the same time, the file size is more
important than decompression time. So people in our lab don't use it!
We need a better way to deal in Theano with datasets that don't fit in
memory. We also need a way to handle, directly in Theano, generating
outputs that don't fit in memory.
For this I plan to test, at the end of this week and next week,
something that will use pytables[1]. It is from the same author as
carray and also allows using the same compression algorithms, as well as
gzip and lzo. It also handles the case when not all the data fits in memory.
I think I read in a paper on pytables that lzo gives approximately the
same file size as gzip, but is faster at decompression. PyTables uses
gzip by default, as it is installed by default, while lzo is not.
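Roughly the kind of thing I have in mind (file name, shapes, and filter settings here are invented; assumes PyTables is installed, and uses the modern spelling of its API):

```python
import numpy as np
import tables  # PyTables

# Write an extendable, compressed array (zlib shown; 'lzo' works the same
# way when the LZO library is available).
filters = tables.Filters(complevel=5, complib="zlib")
with tables.open_file("dataset.h5", mode="w") as f:
    X = f.create_earray(f.root, "X",
                        atom=tables.Float32Atom(),
                        shape=(0, 784),          # first dim is extendable
                        filters=filters)
    for i in range(10):                          # append chunk by chunk
        X.append(np.full((100, 784), i, dtype="float32"))

# Read back one minibatch at a time: only the chunks covering the slice
# are decompressed, so the whole dataset never has to fit in memory.
with tables.open_file("dataset.h5", mode="r") as f:
    batch = f.root.X[200:300]                    # rows from the third chunk
```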
Fred
p.s. I will try to remember the trick of transposing the input. I think
it will be applicable to many algorithms.
On Mon, Jan 3, 2011 at 7:03 PM, Josh Bleecher Snyder wrote:
Actually, for the numbers I gave in this email thread, I didn't do any
transposition at all. (Sorry for any confusion. When I said in
response to James that I transposed the numbers, I meant the numbers
in the final results table, not the actual datasets.)
> Don't forget that the compression ratio of carray vs. gzip will
> change. carray is specialized for data with low entropy, unlike
> gzip. So you must do the file size comparison for each dataset that
> you will use.
There will of course be variation per dataset, although gzip also
won't work well on data with high entropy (almost by definition). And
actually, the way blosc's pack_array works (which is what I was
using) is by pickling the numpy array and then compressing the
resulting string. So it is actually treating the ndarray as an opaque
string, much like gzip. (Interestingly, the pickling/unpickling
accounts for the vast majority of blosc's pack_array and unpack_array
run time.) So while I agree that it is definitely worth experimenting
with each new data set, I think blosc has decent odds of performing
well across the board, at least as compared with gzip.
> We need a better way to deal in Theano with datasets that don't fit in
> memory. We also need a way to handle, directly in Theano, generating
> outputs that don't fit in memory.
>
> For this I plan to test, at the end of this week and next week,
> something that will use pytables[1]. It is from the same author as
> carray and also allows using the same compression algorithms, as well as
> gzip and lzo. It also handles the case when not all the data fits in memory.
That'd definitely be handy for me. Part of the reason that I've been
poking around at all these is that my dataset will soon not fit in
host memory either, so I'm looking at either keeping it compressed in
memory or reading it in chunks from the filesystem as needed. I look
forward to seeing what you come up with (some form of transparent
compression and decompression, I presume?), particularly as it strikes
me as being a hard problem to solve generally. Please do let me know
if I can be of assistance on this front.
I don't plan to support compressed in-memory data directly in Theano,
but I think it is possible to do so with pytables :) It is to keep all
that complicated stuff outside Theano as long as possible that I will
try PyTables first :)
I will keep the list updated when I have something working.
Fred