Error running create_imagenet.sh


Antonio Paes

Apr 23, 2015, 2:06:19 PM
to caffe...@googlegroups.com
Hi guys,

I'm getting this error when I run create_imagenet.sh:


Creating train lmdb...
I0423 15:03:36.275789 24793 convert_imageset.cpp:79] Shuffling data
E0423 15:03:36.276610 24793 common.cpp:93] Cannot create Cublas handle. Cublas won't be available.
E0423 15:03:36.277169 24793 common.cpp:100] Cannot create Curand generator. Curand won't be available.
I0423 15:03:36.277463 24793 convert_imageset.cpp:82] A total of 0 images.
F0423 15:03:36.277499 24793 db.cpp:27] Check failed: mkdir(source.c_str(), 0744) == 0 (-1 vs. 0) mkdir examples/imagenet/ilsvrc12_train_lmdbfailed
*** Check failure stack trace: ***
    @     0x7f42b6cf0f6d  google::LogMessage::Fail()
    @     0x7f42b6cf2f23  google::LogMessage::SendToLog()
    @     0x7f42b6cf0ae9  google::LogMessage::Flush()
    @     0x7f42b6cf394e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f42bcad867e  caffe::db::LMDB::Open()
    @           0x403603  main
    @     0x7f42b27df800  __libc_start_main
    @           0x404419  _start
./examples/imagenet/create_imagenet.sh: line 45: 24793 Aborted                 (core dumped) GLOG_logtostderr=1 $TOOLS/convert_imageset --resize_height=$RESIZE_HEIGHT --resize_width=$RESIZE_WIDTH --shuffle $TRAIN_DATA_ROOT $DATA/train.txt $EXAMPLE/ilsvrc12_train_lmdb

Can anybody help me?

Thanks.

codingTornado

Apr 23, 2015, 3:40:55 PM
to caffe...@googlegroups.com
This looks like the script is attempting to create a folder it's not allowed to. The problem is most likely the existing ilsvrc12_train_lmdb directory, which should be removed before running the script again... that is my best guess, since this has always fixed it for me.
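In case it helps, this is roughly what I do before retrying. It's a minimal sketch assuming the default output paths from examples/imagenet/create_imagenet.sh; adjust the paths if you changed $EXAMPLE in the script:

# Remove lmdb directories left over from a previous (failed) run,
# then re-run the conversion script.
rm -rf examples/imagenet/ilsvrc12_train_lmdb
rm -rf examples/imagenet/ilsvrc12_val_lmdb
./examples/imagenet/create_imagenet.sh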

Antonio Paes

Apr 24, 2015, 1:21:31 PM
to caffe...@googlegroups.com
Thanks codingTornado, I deleted the files, ran it again, and it worked. But now, when I run the command to start training, I get this error:

I0424 14:10:13.356081   478 caffe.cpp:113] Use GPU with device ID 0
F0424 14:10:13.356964   478 common.cpp:131] Check failed: error == cudaSuccess (35 vs. 0)  CUDA driver version is insufficient for CUDA runtime version
*** Check failure stack trace: ***
    @     0x7f9913702f6d  google::LogMessage::Fail()
    @     0x7f9913704f23  google::LogMessage::SendToLog()
    @     0x7f9913702ae9  google::LogMessage::Flush()
    @     0x7f991370594e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f99194b8b72  caffe::Caffe::SetDevice()
    @           0x407f63  train()
    @           0x405967  main
    @     0x7f990f1f1800  __libc_start_main
    @           0x405f89  _start
Aborted (core dumped)

Have you seen anything like this before?

thanks.

Steven

Apr 24, 2015, 2:47:29 PM
to caffe...@googlegroups.com
I suspect  "CUDA driver version is insufficient for CUDA runtime version" is the key error. What version of CUDA driver & runtime are you using? Have you thought about updating your CUDA driver?
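If you're not sure, something like this should tell you. These are the standard NVIDIA tools rather than anything Caffe-specific, so this assumes they're installed and on your PATH:

# Installed NVIDIA driver version and the GPUs it can see
nvidia-smi
# Driver version as reported by the kernel module
cat /proc/driver/nvidia/version
# CUDA toolkit/runtime version (the one Caffe was built against)
nvcc --version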

Antonio Paes

Apr 25, 2015, 2:45:14 PM
to caffe...@googlegroups.com
It was version 4, but since I'm running my tests on a server, I've asked the server administrator to update the NVIDIA and CUDA drivers. For now I'm waiting.


Thanks, codingTornado. If I run into more problems, I'll report back.

Antonio Paes

Apr 25, 2015, 7:03:24 PM
to caffe...@googlegroups.com
Hey Steven, I updated the driver to version 7.0, but the error persists. Any idea?

Steven

Apr 25, 2015, 9:00:20 PM
to caffe...@googlegroups.com
What is the new (or, at least, current) error?

Antonio Paes

Apr 25, 2015, 9:06:34 PM
to caffe...@googlegroups.com
The current version is 7.0.

Steven

Apr 25, 2015, 11:16:51 PM
to caffe...@googlegroups.com
But what is the *actual* error message you get?

Antonio Paes

Apr 26, 2015, 12:33:51 AM
to caffe...@googlegroups.com
Hi Steven, I solved that problem by updating the CUDA drivers. But now, after creating the lmdb files and running make_imagenet_mean.sh to create the binaryproto file, when I run the command to train the network I get this error:

I0426 01:28:49.218237 30454 net.cpp:84] Creating Layer conv3
I0426 01:28:49.218240 30454 net.cpp:380] conv3 <- norm2
I0426 01:28:49.218245 30454 net.cpp:338] conv3 -> conv3
I0426 01:28:49.218250 30454 net.cpp:113] Setting up conv3
F0426 01:28:49.220103 30457 data_transformer.cpp:138] Check failed: height <= datum_height (227 vs. 61) 
*** Check failure stack trace: ***
    @     0x7fbd61242f6d  google::LogMessage::Fail()
    @     0x7fbd61244f23  google::LogMessage::SendToLog()
    @     0x7fbd61242ae9  google::LogMessage::Flush()
    @     0x7fbd6124594e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7fbd67045c1b  caffe::DataTransformer<>::Transform()
    @     0x7fbd670956bc  caffe::DataLayer<>::InternalThreadEntry()
    @     0x7fbd5edb051a  (unknown)
    @     0x7fbd5d0bb374  start_thread
    @     0x7fbd5cdf927d  __clone

Steven

Apr 26, 2015, 1:13:40 AM
to caffe...@googlegroups.com
Again, let the error message guide you.... I think the key message is:


F0426 01:28:49.220103 30457 data_transformer.cpp:138] Check failed: height <= datum_height (227 vs. 61)
 
I believe that imagenet expects to crop images to 227x227 (from 256x256). It looks as though, for some reason, imagenet is finding a 61x?? (or ??x61) image. Is there any reason to believe that may have happened to you? If I look at the data_transformer.cpp code, I see:

template<typename Dtype>
void DataTransformer<Dtype>::Transform(const Datum& datum,
                                       Blob<Dtype>* transformed_blob) {
  const int datum_channels = datum.channels();
  const int datum_height = datum.height();
  const int datum_width = datum.width();

  const int channels = transformed_blob->channels();
  const int height = transformed_blob->height();
  const int width = transformed_blob->width();
  const int num = transformed_blob->num();

  CHECK_EQ(channels, datum_channels);
  CHECK_LE(height, datum_height);  // <---- line 138, the failing check
  CHECK_LE(width, datum_width);
  CHECK_GE(num, 1);

So it looks as though there's a discrepancy between your transformed blob and the data. I'm afraid I can't offer much more help, since I'm also new to caffe and have not trained an imagenet yet. I've only played with one that's already trained.....
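If you want to rule out a bad input, you could spot-check the sizes of a few images from your training list. This is only a sketch: it assumes ImageMagick's identify is installed, and you'd have to point TRAIN_DATA_ROOT and the list path at whatever you used in create_imagenet.sh:

# Print the dimensions of the first few images listed in train.txt
TRAIN_DATA_ROOT=/path/to/your/train/images/   # same root you gave convert_imageset
head -5 data/ilsvrc12/train.txt | awk '{print $1}' | \
  while read f; do identify "$TRAIN_DATA_ROOT$f"; done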

Antonio Paes

Apr 26, 2015, 1:19:31 AM
to caffe...@googlegroups.com
That's probably it: I'm using 54 x 61 images. Do you think that if I adapt the files to match my images, I have a chance of succeeding?

And thank you very much for the help!


Steven

Apr 26, 2015, 1:29:03 AM
to caffe...@googlegroups.com
Yes, I think that may be it. If you look at caffe/models/bvlc_reference_caffenet/train_val.prototxt, you can see that it's expecting 227x227 images:

name: "CaffeNet"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
# mean pixel / channel-wise mean instead of mean image
#  transform_param {
#    crop_size: 227
#    mean_value: 104
#    mean_value: 117
#    mean_value: 123
#    mirror: true
#  }
  data_param {
    source: "examples/imagenet/ilsvrc12_train_lmdb"
    batch_size: 256
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mirror: false
    crop_size: 227
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
# mean pixel / channel-wise mean instead of mean image
#  transform_param {
#    crop_size: 227
#    mean_value: 104
#    mean_value: 117
#    mean_value: 123
#    mirror: true
#  }
  data_param {
    source: "examples/imagenet/ilsvrc12_val_lmdb"
    batch_size: 50
    backend: LMDB
  }
}

Antonio Paes

Apr 26, 2015, 1:39:05 AM
to caffe...@googlegroups.com

Actually, you're right. Man, I need to sleep; here in Brazil it's already 2:36 AM, so I'll continue tomorrow. If I can help you with anything, feel free to ask as well. I'm new to Caffe, but with some effort we'll tame this monster.

Thank you for your help.

Antonio Paes

Apr 27, 2015, 9:46:18 AM
to caffe...@googlegroups.com
Hey Steven, I'm editing the train_val.prototxt file and I have some questions. For example, since I'm using 54 x 61 images, do I have to change the values of batch_size and kernel_size?

And what is batch_size?

Steven

Apr 28, 2015, 1:01:29 AM
to caffe...@googlegroups.com
Batch_size, I believe, is the number of images you train on at any given time (i.e. the number of images in the 'batch'). I'm fairly certain you would need to modify the kernel size, since the kernel is the convolutional kernel, so it helps determine the size of the output for a given layer. Since your input is not of the size the net was designed for, there will probably be problems.

Rather than modifying the net, I think you should just scale your images up to 256x256. Then, when you have things working and a better understanding of convolutional nets, think about modifying an existing architecture.
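For what it's worth, the stock examples/imagenet/create_imagenet.sh already has resize knobs for exactly this, so you can let convert_imageset rescale everything to 256x256 as it writes the lmdb instead of resizing the images yourself. A sketch of the relevant part (variable names as in the stock script; your copy may differ):

# In examples/imagenet/create_imagenet.sh:
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=256
  RESIZE_WIDTH=256
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi
# ...which the script later passes to convert_imageset:
#   --resize_height=$RESIZE_HEIGHT --resize_width=$RESIZE_WIDTH

After changing that, recreate the lmdbs and the image mean before training again.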

Antonio Paes

Apr 28, 2015, 1:18:18 PM
to caffe...@googlegroups.com
Exactly, that's what I'm thinking. I'm running an initial test with Caffe's default parameters; after that, I plan to modify them.

Antonio Paes

Apr 28, 2015, 10:48:45 PM
to caffe...@googlegroups.com
Hi Steven, I updated the parameters and started training the model. Everything was fine, but when the network finished training, I got this error:

I0428 23:45:21.497864 27543 solver.cpp:334] Snapshotting to models/bvlc_reference_caffenet/caffenet_train_iter_50000.caffemodel
I0428 23:45:21.555531 27543 solver.cpp:342] Snapshotting solver state to models/bvlc_reference_caffenet/caffenet_train_iter_50000.solverstate
I0428 23:45:21.649721 27543 solver.cpp:248] Iteration 50000, loss = 1.95654
I0428 23:45:21.649745 27543 solver.cpp:266] Iteration 50000, Testing net (#0)
I0428 23:45:35.369022 27543 solver.cpp:315]     Test net output #0: accuracy = 0
I0428 23:45:35.369060 27543 solver.cpp:315]     Test net output #1: loss = 3.23151 (* 1 = 3.23151 loss)
I0428 23:45:35.369065 27543 solver.cpp:253] Optimization Done.
I0428 23:45:35.369070 27543 caffe.cpp:134] Optimization Done.
*** Aborted at 1430275535 (unix time) try "date -d @1430275535" if you are using GNU date ***
PC: @     0x7ff20c7f2150 __lll_unlock_elision
*** SIGSEGV (@0x0) received by PID 27543 (TID 0x7ff216e4b980) from PID 0; stack trace: ***
    @     0x7ff20c7f0740 (unknown)
    @     0x7ff20c7f2150 __lll_unlock_elision
    @     0x7ff1f9ea212c (unknown)
    @     0x7ff1f9e348c2 (unknown)
Segmentation fault (core dumped)

Any idea?

Thanks.

MOHANA LAHARI Tanguturi

May 26, 2017, 2:22:41 AM
to Caffe Users
Could you tell me which files you deleted?