Hi All-
I have a net that requires some pretty complex data augmentation / selection / preparation before the data is fed in. Being far more capable with Python than C/C++, I've implemented all of this in Python. What I'm thinking of doing now is fetching the next batch of data in parallel while Caffe runs the current batch through the net, all via the pycaffe interface. Since I'm going to be training in this manner, I have a few questions.
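For concreteness, here's a rough sketch of the training loop I have in mind. This assumes the net declares input blobs named "data" and "label" (since it has no data layer), and load_and_augment_batch is just a stand-in for my real preprocessing; here it fabricates random batches so the sketch is self-contained:

```python
import threading
import numpy as np
import caffe

try:
    import Queue  # Python 2, which classic pycaffe targets
except ImportError:
    import queue as Queue  # Python 3

BATCH, CHANNELS, H, W = 32, 3, 227, 227

def load_and_augment_batch():
    # Stand-in for my real augmentation/selection/preparation code;
    # here it just fabricates a random batch of the right shape.
    data = np.random.randn(BATCH, CHANNELS, H, W).astype(np.float32)
    labels = np.random.randint(0, 10, size=(BATCH,)).astype(np.float32)
    return data, labels

def prefetch_worker(q):
    while True:
        q.put(load_and_augment_batch())

q = Queue.Queue(maxsize=2)  # stay at most 2 batches ahead of the GPU
t = threading.Thread(target=prefetch_worker, args=(q,))
t.daemon = True
t.start()

caffe.set_mode_gpu()
solver = caffe.SGDSolver('solver.prototxt')  # my solver definition

for it in range(100000):
    data, labels = q.get()
    # Copy the prefetched batch into the net's input blobs, then step once
    solver.net.blobs['data'].data[...] = data
    solver.net.blobs['label'].data[...] = labels
    solver.step(1)  # one forward/backward/update on this batch
```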
(1) How much overhead would people expect the disk -> Python -> GPU data shuttling to introduce?
(2) Since the net won't actually have a data layer, does this mean I will effectively be training on a deploy network? Does it still make sense to produce a train_val.prototxt?
(3) Will Caffe still attempt to test the net while I'm calling the forward and backward passes myself? I'm leaning towards no, since there won't actually be a solver driving a train_val network, but this is still unclear to me.
(4) Does it make more sense to engineer a Python layer as the data source? A rough sketch of what I'm imagining is below.
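This follows the caffe.Layer interface, with the prototxt wiring sketched in the docstring; again, the random data is just a placeholder for my real preprocessing, and I understand this route requires Caffe built with WITH_PYTHON_LAYER := 1:

```python
import numpy as np
import caffe

class AugmentingDataLayer(caffe.Layer):
    """Feeds augmented batches as a data source.

    Wired into the train prototxt with something like:
        layer {
          name: "data"
          type: "Python"
          top: "data"
          top: "label"
          python_param { module: "augmenting_data_layer"
                         layer: "AugmentingDataLayer" }
        }
    """

    def setup(self, bottom, top):
        self.batch = 32  # batch size; could be parsed from self.param_str

    def reshape(self, bottom, top):
        # Fixed output shapes here; could be made dynamic per batch
        top[0].reshape(self.batch, 3, 227, 227)
        top[1].reshape(self.batch)

    def forward(self, bottom, top):
        # My augmentation/selection/preparation would run here;
        # random data as a placeholder.
        top[0].data[...] = np.random.randn(*top[0].data.shape)
        top[1].data[...] = np.random.randint(0, 10, size=self.batch)

    def backward(self, top, propagate_down, bottom):
        pass  # a data layer has nothing to backpropagate
```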
(5) Is it possible to omit a test phase altogether?
Thanks!