A total of 0 images with leveldb


Ehsan Jahangiri

unread,
Apr 28, 2015, 11:47:27 PM4/28/15
to caffe...@googlegroups.com
Hey guys,

I used convert_mnist_data.cpp to create a leveldb version of the MNIST dataset, and it produced two .log files for the MNIST train and test sets. However, when I try to train the net with the following input layer in "lenet_train_test.prototxt", I get the message "A total of 0 images". I am running in CPU mode. Is there anything wrong with the way the MNIST leveldb files were generated? Could anyone please send me MNIST in lmdb format if possible? I am working on Windows, and convert_mnist_data.cpp crashes when generating the lmdb format for MNIST.

I would appreciate your help.

Thanks a lot
Ehsan


name: "LeNet"
layers {
  name: "mnist"
  type: IMAGE_DATA
  top: "data"
  top: "label"
  data_param {
    source: "examples/mnist/mnist_train_leveldb"
    batch_size: 64
    backend: LEVELDB
  }
  transform_param {
    scale: 0.00390625
  }
  include: { phase: TRAIN }
}


I0428 23:24:49.050755 11040 layer_factory.hpp:74] Creating layer mnist
I0428 23:24:49.051754 11040 net.cpp:84] Creating Layer mnist
I0428 23:24:49.051754 11040 net.cpp:338] mnist -> data
I0428 23:24:49.052754 11040 net.cpp:338] mnist -> label
I0428 23:24:49.053755 11040 net.cpp:113] Setting up mnist
I0428 23:24:49.053755 11040 image_data_layer.cpp:36] Opening file
I0428 23:24:49.054754 11040 image_data_layer.cpp:51] A total of 0 images
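Note the log above: it is image_data_layer.cpp that reports "A total of 0 images", meaning the `type: IMAGE_DATA` layer is trying to read the leveldb directory path as if it were a text index file of image names. In Caffe's old `layers` syntax, a leveldb or lmdb source belongs to a `DATA` layer instead. A sketch of what that input layer would look like (assuming the standard LeNet example paths):

```
name: "LeNet"
layers {
  name: "mnist"
  type: DATA
  top: "data"
  top: "label"
  data_param {
    source: "examples/mnist/mnist_train_leveldb"
    batch_size: 64
    backend: LEVELDB
  }
  transform_param {
    scale: 0.00390625
  }
  include: { phase: TRAIN }
}
```

Conversely, `IMAGE_DATA` expects an `image_data_param` whose `source` is a text file listing one `filename label` pair per line.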

npit

unread,
Apr 29, 2015, 6:11:59 AM4/29/15
to caffe...@googlegroups.com
I ran into many errors like this too. It's usually a path problem, either the path to the source file or the paths of the images listed inside it.
Double-check everything and you'll find the error.

Ehsan Jahangiri

unread,
Apr 29, 2015, 1:44:59 PM4/29/15
to caffe...@googlegroups.com
Thank you for your advice. It turned out the format of the input training file was wrong and not compatible with image_data_layer.cpp. The problem is fixed now.

npit

unread,
Apr 30, 2015, 3:18:47 AM4/30/15
to caffe...@googlegroups.com
Great. What format was incompatible? You mean the files weren't .jpg?

Ehsan Jahangiri

unread,
Apr 30, 2015, 12:28:26 PM4/30/15
to caffe...@googlegroups.com
I used the following code, an extended version of "https://github.com/BVLC/caffe/blob/master/examples/mnist/convert_mnist_data.cpp", which writes the training and test MNIST images as .png files along with two text files in which each line contains a file name followed by its label (0-9). I can no longer find where I originally got it, but it had a bug that I fixed below, in the images_index line of the "files" branch. That line used to be:

images_index << filename << " " << label_str << "\n";

which caused the label column in the text files "mnist_train_files_images_index.txt" and "mnist_test_files_images_index.txt" to come out as strange (unprintable) characters, so the file names and their labels couldn't be read ("A total of 0 images") by the following lines in image_data_layer.cpp:

  while (infile >> filename >> label) {
    lines_.push_back(std::make_pair(filename, label));
  }

I changed "label_str" to "(int)label" and the problem was resolved.


I didn't have any luck with the "lmdb" and "leveldb" formats. I am using Windows in CPU mode. "convert_mnist_data.cpp" crashes when I try to generate "lmdb" files, so I can't even get the MNIST data into a format Caffe can read. I could generate "leveldb" files, but Caffe couldn't read or find them and gave the "A total of 0 images" message. It would be great if someone could fix "convert_mnist_data.cpp" below so it generates "lmdb" files on Windows as well, or comment on why leveldb might not work for me.
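One possible culprit for the lmdb crash (an assumption, not verified in this thread): on Windows, LMDB allocates the memory-mapped file at its full declared size on disk, so the 1 TB mdb_env_set_mapsize call in the code below can fail where it works fine on Linux. A hedged sketch of a platform-guarded map size, as a patch to that call:

```
// Hypothetical workaround sketch: shrink the LMDB map on Windows, where
// the map file is allocated at full size up front (assumption, not verified).
#ifdef _MSC_VER
const size_t kMapSize = 1073741824;     // 1 GB, ample for MNIST (~55 MB raw)
#else
const size_t kMapSize = 1099511627776;  // 1 TB, as in the original code
#endif
CHECK_EQ(mdb_env_set_mapsize(mdb_env, kMapSize), MDB_SUCCESS)
    << "mdb_env_set_mapsize failed";
```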

Were you able to generate lmdb files? Would you please share your code and your generated files?

Thank you very much

Ehsan

--------------------------------------------------------------------------------------------

// This script converts the MNIST dataset to a lmdb (default) or
// leveldb (--backend=leveldb) format used by caffe to load data.
// Usage:
//    convert_mnist_data [FLAGS] input_image_file input_label_file
//                        output_db_file
// The MNIST dataset could be downloaded at

#include <gflags/gflags.h>
#include <glog/logging.h>
#include <google/protobuf/text_format.h>
#include <leveldb/db.h>
#include <leveldb/write_batch.h>
#include <lmdb.h>
#include <stdint.h>
#include <sys/stat.h>

#include <iostream>

#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/highgui/highgui_c.h>
#include <opencv2/imgproc/imgproc.hpp>

#include <fstream>  // NOLINT(readability/streams)
#include <string>

#include "caffe/proto/caffe.pb.h"

#ifdef _MSC_VER 
#include <direct.h> 
#define snprintf sprintf_s 
#endif

using namespace caffe;  // NOLINT(build/namespaces)
using std::string;

DEFINE_string(backend, "lmdb", "The backend for storing the result");

uint32_t swap_endian(uint32_t val) {
  val = ((val << 8) & 0xFF00FF00) | ((val >> 8) & 0xFF00FF);
  return (val << 16) | (val >> 16);
}

void convert_dataset(const char* image_filename, const char* label_filename,
    const char* db_path, const string& db_backend) {
  // Open files
  std::ifstream image_file(image_filename, std::ios::in | std::ios::binary);
  std::ifstream label_file(label_filename, std::ios::in | std::ios::binary);
  CHECK(image_file) << "Unable to open file " << image_filename;
  CHECK(label_file) << "Unable to open file " << label_filename;
  // Read the magic and the meta data
  uint32_t magic;
  uint32_t num_items;
  uint32_t num_labels;
  uint32_t rows;
  uint32_t cols;

  image_file.read(reinterpret_cast<char*>(&magic), 4);
  magic = swap_endian(magic);
  CHECK_EQ(magic, 2051) << "Incorrect image file magic.";
  label_file.read(reinterpret_cast<char*>(&magic), 4);
  magic = swap_endian(magic);
  CHECK_EQ(magic, 2049) << "Incorrect label file magic.";
  image_file.read(reinterpret_cast<char*>(&num_items), 4);
  num_items = swap_endian(num_items);
  label_file.read(reinterpret_cast<char*>(&num_labels), 4);
  num_labels = swap_endian(num_labels);
  CHECK_EQ(num_items, num_labels);
  image_file.read(reinterpret_cast<char*>(&rows), 4);
  rows = swap_endian(rows);
  image_file.read(reinterpret_cast<char*>(&cols), 4);
  cols = swap_endian(cols);

  // lmdb
  MDB_env* mdb_env = NULL;
  MDB_dbi mdb_dbi;
  MDB_val mdb_key, mdb_data;
  MDB_txn* mdb_txn = NULL;
  // leveldb
  leveldb::DB* db = NULL;
  leveldb::Options options;
  options.error_if_exists = true;
  options.create_if_missing = true;
  options.write_buffer_size = 268435456;  // 256 MB
  leveldb::WriteBatch* batch = NULL;
  // index file for the "files" backend
  std::ofstream images_index;

  // Open db
  if (db_backend == "leveldb") {  // leveldb
    LOG(INFO) << "Opening leveldb " << db_path;
    leveldb::Status status = leveldb::DB::Open(options, db_path, &db);
    CHECK(status.ok()) << "Failed to open leveldb " << db_path
        << ". Is it already existing?";
    batch = new leveldb::WriteBatch();
  } else if (db_backend == "lmdb") {  // lmdb
    LOG(INFO) << "Opening lmdb " << db_path;
#ifndef _MSC_VER
    CHECK_EQ(mkdir(db_path, 0744), 0) << "mkdir " << db_path << " failed";
#else
    CHECK_EQ(_mkdir(db_path), 0) << "mkdir " << db_path << " failed";
#endif
    CHECK_EQ(mdb_env_create(&mdb_env), MDB_SUCCESS) << "mdb_env_create failed";
    CHECK_EQ(mdb_env_set_mapsize(mdb_env, 1099511627776), MDB_SUCCESS)  // 1TB
        << "mdb_env_set_mapsize failed";
    CHECK_EQ(mdb_env_open(mdb_env, db_path, 0, 0664), MDB_SUCCESS)
        << "mdb_env_open failed";
    CHECK_EQ(mdb_txn_begin(mdb_env, NULL, 0, &mdb_txn), MDB_SUCCESS)
        << "mdb_txn_begin failed";
    CHECK_EQ(mdb_open(mdb_txn, NULL, 0, &mdb_dbi), MDB_SUCCESS)
        << "mdb_open failed. Does the lmdb already exist?";
  } else if (db_backend == "files") {
#ifndef _MSC_VER
    CHECK_EQ(mkdir(db_path, 0744), 0) << "mkdir " << db_path << " failed";
#else
    CHECK_EQ(_mkdir(db_path), 0) << "mkdir " << db_path << " failed";
#endif
    std::string db_path_str(db_path);
    std::string index_filename = db_path_str + "_images_index.txt";
    images_index.open(index_filename.c_str());
  } else {
    LOG(FATAL) << "Unknown db backend " << db_backend;
  }

  // Storing to db
  char label;
  char* pixels = new char[rows * cols];
  int count = 0;
  const int kMaxKeyLength = 10;
  char key_cstr[kMaxKeyLength];
  string value;

  Datum datum;
  datum.set_channels(1);
  datum.set_height(rows);
  datum.set_width(cols);
  LOG(INFO) << "A total of " << num_items << " items.";
  LOG(INFO) << "Rows: " << rows << " Cols: " << cols;
  for (int item_id = 0; item_id < num_items; ++item_id) {
    image_file.read(pixels, rows * cols);
    label_file.read(&label, 1);
    datum.set_data(pixels, rows * cols);
    datum.set_label(label);
    snprintf(key_cstr, kMaxKeyLength, "%08d", item_id);
    datum.SerializeToString(&value);
    string keystr(key_cstr);

    // Put in db
    if (db_backend == "leveldb") {  // leveldb
      batch->Put(keystr, value);
    } else if (db_backend == "lmdb") {  // lmdb
      mdb_data.mv_size = value.size();
      mdb_data.mv_data = reinterpret_cast<void*>(&value[0]);
      mdb_key.mv_size = keystr.size();
      mdb_key.mv_data = reinterpret_cast<void*>(&keystr[0]);
      CHECK_EQ(mdb_put(mdb_txn, mdb_dbi, &mdb_key, &mdb_data, 0), MDB_SUCCESS)
          << "mdb_put failed";
    } else if (db_backend == "files") {
      // Note: cv::Size expects (width, height), i.e. (cols, rows); the
      // transposed order is harmless here only because MNIST is square.
      cv::Mat image_as_mat(cv::Size(rows, cols), CV_8UC1, pixels);

      std::vector<int> compression_params;
      compression_params.push_back(CV_IMWRITE_PNG_COMPRESSION);
      std::ostringstream item_id_stream;
      item_id_stream << item_id;
      std::string db_path_str(db_path);
      std::string filename = db_path_str + "/" + item_id_stream.str() + ".png";  // TODO: not portable

      imwrite(filename, image_as_mat, compression_params);
      // Cast to int so the label is written as a decimal digit, not a raw byte.
      images_index << filename << " " << (int)label << "\n";
    } else {
      LOG(FATAL) << "Unknown db backend " << db_backend;
    }

    if (++count % 1000 == 0) {
      // Commit txn
      if (db_backend == "leveldb") {  // leveldb
        db->Write(leveldb::WriteOptions(), batch);
        delete batch;
        batch = new leveldb::WriteBatch();
      } else if (db_backend == "lmdb") {  // lmdb
        CHECK_EQ(mdb_txn_commit(mdb_txn), MDB_SUCCESS)
            << "mdb_txn_commit failed";
        CHECK_EQ(mdb_txn_begin(mdb_env, NULL, 0, &mdb_txn), MDB_SUCCESS)
            << "mdb_txn_begin failed";
      } else if (db_backend == "files") {
        // Nothing to do here.
      } else {
        LOG(FATAL) << "Unknown db backend " << db_backend;
      }
    }
  }
  // write the last batch
  if (count % 1000 != 0) {
    if (db_backend == "leveldb") {  // leveldb
      db->Write(leveldb::WriteOptions(), batch);
      delete batch;
      delete db;
    } else if (db_backend == "lmdb") {  // lmdb
      CHECK_EQ(mdb_txn_commit(mdb_txn), MDB_SUCCESS) << "mdb_txn_commit failed";
      mdb_close(mdb_env, mdb_dbi);
      mdb_env_close(mdb_env);
    } else if (db_backend == "files") {
      // Nothing to do here.
    } else {
      LOG(FATAL) << "Unknown db backend " << db_backend;
    }
    LOG(ERROR) << "Processed " << count << " files.";
  }

  // close resources
  if (db_backend == "files") {
    images_index.close();
  }

  delete[] pixels;  // array delete to match new[]
}

int main(int argc, char** argv) {
#ifndef GFLAGS_GFLAGS_H_
  namespace gflags = google;
#endif

  gflags::SetUsageMessage("This script converts the MNIST dataset to\n"
      "the lmdb/leveldb format used by Caffe to load data.\n"
      "Usage:\n"
      "    convert_mnist_data [FLAGS] input_image_file input_label_file "
      "output_db_file\n"
      "The MNIST dataset could be downloaded at\n"
      "You should gunzip them after downloading, "
      "or directly use data/mnist/get_mnist.sh\n");
  gflags::ParseCommandLineFlags(&argc, &argv, true);

  const string& db_backend = FLAGS_backend;

  if (argc != 4) {
    gflags::ShowUsageWithFlagsRestrict(argv[0],
        "examples/mnist/convert_mnist_data");
  } else {
    google::InitGoogleLogging(argv[0]);
    convert_dataset(argv[1], argv[2], argv[3], db_backend);
  }
  return 0;
}

--------------------------------------------------------------------------------------------

npit

unread,
May 4, 2015, 2:15:43 AM5/4/15
to caffe...@googlegroups.com
I used convert_imagenet.cpp as is. Any changes I made were only to get it to compile on Windows.
Lmdb does not seem to work on Windows, but leveldb does.

Note that while everything runs error-free, I have not been able to successfully train anything on Windows; I have only done feature extraction, so I am not 100% certain that it works properly (feature extraction seems to, though).
I haven't pinpointed the source of the training failures yet, so it may well be the images.

I followed this tutorial to build Caffe (and the convert and mean tools) on Windows.

Hong Hanh

unread,
Jul 21, 2015, 5:49:49 AM7/21/15
to caffe...@googlegroups.com
Hi npit, I followed the same tutorial as you to build Caffe on Windows, but I have a problem when using convert_imageset.cpp to convert my images to leveldb format.
I got this message after running convert_imageset.cpp:

E0721 18:09:59.985489 11148 convert_imageset.cpp:152] Processed 36 files.

And I only got these files in my leveldb folder (no .sst files):


In convert_imageset.cpp, I only added these lines to make it work:

#ifdef _MSC_VER
#define snprintf sprintf_s
#endif

Have you ever run into this problem?

npit

unread,
Jul 21, 2015, 6:10:34 AM7/21/15
to caffe...@googlegroups.com
That's what my data folder looks like, containing about 3K 256x256 images:



Are there indeed 36 images in your input? Can you share the image paths file?


Hong Hanh

unread,
Jul 21, 2015, 11:31:48 AM7/21/15
to caffe...@googlegroups.com

Thanks for your reply.
I tested with only 36 images in my image list file. I have checked the file paths carefully, so I don't think that's the problem.

I don't know why my data folder doesn't contain .sst files like yours. Do I need to change anything else in convert_imageset.cpp?

npit

unread,
Jul 22, 2015, 3:56:37 AM7/22/15
to Caffe Users
I have changed a lot of stuff in my convert_imageset, nothing substantial though.
Can you post the entire output after running the conversion for the 36 images, and the output after running it on 1001 images (just copy-pasting the 36 image paths in the source file will do)?

npit

unread,
Jul 22, 2015, 5:51:49 AM7/22/15
to Caffe Users
Just tested it.
The .sst files seem to appear for roughly every 3K images, so your output is OK.
For fewer than 3K images I get files similar to what you posted.

Hong Hanh

unread,
Jul 22, 2015, 9:03:55 PM7/22/15
to Caffe Users, pittar...@gmail.com
Oh, it's good to hear that the missing .sst files are not a problem. I will try with more images. Thank you so much.