MLP and Softmax Caffe Implementation Data Layer Problems

İlker Kesen

27.09.2016, 12:47:56
to caffe...@googlegroups.com
Hi all,

I am a Caffe newcomer and I want to implement the traditional MNIST softmax regression (without any hidden layers) and MLP examples in Caffe. Writing the net architecture is quite easy, but what I am struggling with is the data layer. As far as I understand, the input must be 4-dimensional, and since I'm working on non-grid data, the dimensions should be [batchsize x 1 x 1 x 784] for MNIST. The HDF5 data layer is recommended for this, but when I use it, execution time blows up (the official LeNet example with LMDB completes in ~3.38 sec, while softmax takes ~15 sec!).
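For reference, an HDF5 data layer for input of that shape might look like the sketch below (the file names are placeholders; the actual blob shape comes from the datasets stored in the .h5 files listed in the source text file):

```
layer {
  name: "mnist"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  hdf5_data_param {
    # text file listing the .h5 files, one path per line
    source: "train_h5_list.txt"
    batch_size: 64
  }
}
```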

Therefore, I decided to switch to the LMDB backend for the data layer. The LMDB backend gives the speed I want, but now my accuracy does not improve as much as I expected. 37% is really low, given that I obtained 97% accuracy with HDF5.

When I run my training script I get a warning: "[ ... blocking_queue.cpp:50] Data layer prefetch queue empty". If I am not wrong, I have read that this is because the minibatch prefetching cannot keep up with the training loop (which doesn't make sense to me).

My question is general: what is the best (or easiest) way to deal with 1-dimensional data in Caffe? I followed this blog post [0] and wrote a Python script that takes the pickled MNIST data used in the Theano tutorial and generates LMDB MNIST data with dimensions (total_sample_count, 1, 1, 784). Maybe I'm doing something wrong, but I haven't been able to figure it out yet. If so, where is the mistake?
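One common pitfall in this kind of conversion (a hypothetical guess, since the actual script is in the gist): Theano's pickled MNIST stores pixels as float32 values in [0, 1], while the Datum.data field written to LMDB holds raw uint8 bytes in [0, 255]. Casting the floats to uint8 directly truncates almost every pixel to 0, which would explain near-chance accuracy. A minimal numpy sketch of the rescaling step:

```python
import numpy as np

# Stand-in for a few Theano-style MNIST pixels: float32 values in [0, 1].
img = np.array([0.0, 0.5, 1.0], dtype=np.float32)

# Wrong: a plain cast truncates 0.5 -> 0 and 1.0 -> 1,
# so nearly all pixel information is lost.
wrong = img.astype(np.uint8)

# Right: rescale to [0, 255] first, then cast; the resulting bytes
# (e.g. right.tobytes()) are what a 1 x 1 x 784 datum should carry.
right = (img * 255).astype(np.uint8)

print(wrong.tolist())   # -> [0, 0, 1]
print(right.tolist())   # -> [0, 127, 255]
```

If the data are kept as floats instead, they belong in the datum's float_data field rather than data, and the two must not be mixed.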

All the necessary files are included in the gist [1] that I'm sharing.

--

İlker Kesen

28.09.2016, 08:20:19
to caffe...@googlegroups.com
Okay, now it makes sense to me why Caffe gives the "prefetch queue empty" message as a warning, because GPU memory transfer is more costly than the training step itself. I think that if I use the memory data layer, I can transfer all my data to the GPU and use it from there with no delay, am I right? I guess I am making a mistake in my script converting MNIST data from pickle format to LMDB format, but I couldn't find it. Anyway, you can ignore this question, but I would be very grateful if you could point out my mistake in the LMDB data generation.
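For what it's worth, a memory data layer for this shape could be sketched as below (a hedged example, not from the gist; the shape matches the [batch x 1 x 1 x 784] layout discussed above):

```
layer {
  name: "mnist"
  type: "MemoryData"
  top: "data"
  top: "label"
  memory_data_param {
    batch_size: 64
    channels: 1
    height: 1
    width: 784
  }
}
```

With pycaffe the arrays are then supplied via `net.set_input_arrays(data, labels)`. Note that MemoryData keeps the arrays in host memory, so each batch is still copied to the GPU per iteration; it avoids disk I/O, not the host-to-device transfer.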

Thanks.

khanbaba

28.09.2016, 16:41:12
to Caffe Users
Just a quick answer for your "prefetch queue empty" warning: when your system is busy and it takes more time to load data onto the GPU, Caffe prints this message. It's not a big problem. Close other programs on your computer and you will not receive this message.