I'm looking at Torch7 as an option for several ML text tasks. Is it possible to lazy-load large text files (>20GB) for training on a GPU with Torch? I understand that I'll have to go through the OS file system, but is this something commonly done? How would I go about doing it in Torch? I don't have much experience with GPUs either.
--
You received this message because you are subscribed to the Google Groups "torch7" group.
By lazy-load, do you mean loading parts of the text files (indexed or at random)? You can do that using standard Lua file I/O. If you use the high-level cutorch/cunn interface, you don't need to have experience with GPUs; it's all taken care of for you.
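As a minimal sketch of what "lazy loading with standard Lua file I/O" can look like: read the file in fixed-size chunks so only a small window of the 20GB file is ever in memory. The function name `chunks`, the path, and the chunk size below are placeholders, not anything from this thread.

```lua
-- Hypothetical helper: returns an iterator that yields the file in
-- chunkSize-byte pieces using plain Lua io, closing the file at EOF.
local function chunks(path, chunkSize)
  local f = assert(io.open(path, 'r'))
  return function()
    local piece = f:read(chunkSize)   -- reads up to chunkSize bytes; nil at EOF
    if piece == nil then f:close() end
    return piece
  end
end

-- usage (hypothetical path): each iteration touches only chunkSize bytes
-- for piece in chunks('corpus.txt', 2^20) do   -- 1MB at a time
--   -- tokenize `piece` / build a tensor batch here
-- end
```

For indexed rather than sequential access, the same `io` handle supports `f:seek('set', offset)` to jump to a byte offset before reading.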
If we have to hit the disk, we usually do data loading on separate threads using the threads package (https://github.com/torch/threads-ffi). An example is here: https://github.com/soumith/imagenet-multiGPU.torch
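A rough sketch of that pattern with the `threads` package, assuming it is installed: worker threads hit the disk while the main thread consumes batches. The batch shape and the `-- return loadBatch(i)` placeholder are illustrative assumptions, not code from the linked example.

```lua
-- Background data loading with the torch `threads` package.
local threads = require 'threads'

local pool = threads.Threads(
  4,                        -- number of loader threads
  function()                -- init function, run once per thread
    require 'torch'
  end
)

for i = 1, 100 do
  pool:addjob(
    function()
      -- runs in a worker thread: do the file reading here
      -- return loadBatch(i)        -- placeholder for your loader
      return torch.randn(32, 100)   -- stand-in for a real batch
    end,
    function(batch)
      -- runs back on the main thread: ship to GPU and train, e.g.
      -- local gpuBatch = batch:cuda()
    end
  )
end

pool:synchronize()   -- wait for all pending jobs
pool:terminate()
```

The key point is that the disk reads in the first function never block the main training loop; the second function sees the results in order on the main thread.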