cifar-10, mnist examples data?


Christopher Turnbull

Feb 24, 2016, 8:46:08 AM
to Caffe Users
When looking at the scripts for downloading these two datasets on Caffe's website, what is actually going on?

I see the end product is an LMDB, but could I view the actual images? How would I do this?

And from what I see it's just a .bin script that transforms the data, but can I have a look at what is actually going on here?

As in, if I had my own dataset, e.g. family pictures with names, and I wanted to train a network, how would I go about feeding in the data or converting it?

Jan C Peters

Feb 24, 2016, 9:44:00 AM
to Caffe Users
(1) What exact scripts are you referring to? For everything that Caffe does there is source code in its repository, so there is always a file you can open to see what is actually going on.

(2) For MNIST (and I suppose CIFAR too) the images are stored in a binary format that is not directly compatible with any common image format like BMP or JPG. The reason is to avoid losing data through compression and to save space by not duplicating metadata. Since these are directly converted to Caffe "Datum" structures residing in an LMDB, there is no direct way to actually view them. The easiest way would be to write an extractor/viewer that pulls the queried image from the LMDB and displays it (or saves it in some common image format). This can be done in about 20 lines of Python code. Not too difficult.

(3) To convert your own data, format it correctly and then either use the provided tools (look at the convert_imageset utility) or write your own code to convert it to LMDB. Reading convert_imageset will give you an idea of how that works.
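
On point (2), if the goal is just to look at the MNIST digits, one option is to read the raw downloaded file before it is turned into an LMDB. Below is a minimal sketch, assuming the unzipped image file follows the IDX layout described on the MNIST page (big-endian magic number 0x00000803, then image count, rows, columns, then one unsigned byte per pixel). The function names `parse_idx_images` and `to_pgm` are made up for illustration, and this reads the raw file, not the LMDB, so it is a shortcut and not the LMDB viewer described above.

```python
import struct


def parse_idx_images(raw):
    """Split the bytes of an IDX3 image file into (rows, cols, list of pixel buffers)."""
    # Header: four big-endian unsigned 32-bit integers.
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != 0x00000803:
        raise ValueError("not an IDX3 unsigned-byte image file")
    size = rows * cols
    images = []
    for i in range(count):
        start = 16 + i * size
        images.append(raw[start:start + size])  # row-major, one byte per pixel
    return rows, cols, images


def to_pgm(rows, cols, pixels):
    """Wrap raw 8-bit grayscale pixels in a binary PGM (P5) header."""
    header = f"P5\n{cols} {rows}\n255\n".encode("ascii")
    return header + bytes(pixels)


# Example use (the path is an assumption about where the download script
# leaves the unzipped file; adjust it to your checkout):
#
#   with open("data/mnist/train-images-idx3-ubyte", "rb") as f:
#       rows, cols, images = parse_idx_images(f.read())
#   with open("digit0.pgm", "wb") as f:
#       f.write(to_pgm(rows, cols, images[0]))
```

PGM is used here only because it needs no extra libraries; most image viewers can open it.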

Jan

Christopher Turnbull

Feb 25, 2016, 5:00:18 AM
to Caffe Users
(1) examples/mnist/create_mnist.sh

So I had a look at this .sh file, and it looks like it's basically using

/build/examples/mnist/convert_mnist_data.bin

but I've really no idea what the bin file is!

(2) Ah okay. What is the 'datum' structure exactly, though?

(3) So there are no examples in Caffe of using real-life images? What is the convert_imageset.cpp thing used for? I'm trying to read it but holy cow it looks complicated...

Jan C Peters

Feb 26, 2016, 7:39:59 AM
to Caffe Users

Well, of course you need to look into the sources, which are not in the build dir; the .bin file there is just the executable compiled from them.

(1) https://github.com/BVLC/caffe/blob/master/examples/mnist/convert_mnist_data.cpp

(2) https://github.com/BVLC/caffe/blob/master/src/caffe/proto/caffe.proto#L30

(3) Well, the images from the ImageNet challenge are pretty close to "real life images", I'd say. convert_imageset basically just takes a folder of image files, plus a text file listing each image with its label, and converts them to an LMDB or LevelDB database to use as input for Caffe, iirc.
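
For the family-pictures case, the text file that convert_imageset reads has one line per image: a path relative to the root folder, a space, and an integer label. Here is a minimal sketch for generating those lines, assuming one subfolder per person. That folder layout, the extension list, and the helper name `build_listfile_lines` are my assumptions for illustration, not something Caffe requires.

```python
import os

# Extensions treated as images in this sketch (an assumption; extend as needed).
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")


def build_listfile_lines(root):
    """Return (lines, label_names) for a root folder with one subfolder per class.

    Each line has the form "<subfolder>/<filename> <label>", where the label is
    the index of the subfolder in alphabetical order.
    """
    label_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    lines = []
    for label, name in enumerate(label_names):
        for fname in sorted(os.listdir(os.path.join(root, name))):
            if fname.lower().endswith(IMAGE_EXTENSIONS):
                lines.append(f"{name}/{fname} {label}")
    return lines, label_names


# Example use (paths are hypothetical):
#
#   lines, names = build_listfile_lines("family_photos")
#   with open("train.txt", "w") as f:
#       f.write("\n".join(lines) + "\n")
```

The resulting file would then be passed to the tool as something like `convert_imageset ROOTFOLDER/ LISTFILE DB_NAME`; check the usage message of your build for the exact flags (resizing, shuffling, backend).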

Jan