Network learns but applying net.predict() to cases yields random results in Python


Raffael

May 6, 2015, 6:23:56 AM
To: caffe...@googlegroups.com
Hello!

I trained a network on a set of points (x, y), classifying them as y <= x (label = 0) or y > x (label = 1).

According to the output of `caffe train`, the network learns this pattern: the accuracy starts at 19% and climbs past 98% to 100%.

But I cannot successfully predict points afterwards - the predictions seem to be more or less random.

import numpy as np

x = np.array([[[[0.5]], [[-0.5]]]], dtype=np.float32)  # shape (1, 2, 1, 1)
res = net.predict(x, oversample=False)

Afterwards `res` is correct only about 50% of the time - i.e. chance level for two classes.

You will find the full setup in this IPython Notebook:


Looking forward to your guidance

Kind Regards

Raffael

Yoann

May 6, 2015, 8:57:43 AM
To: caffe...@googlegroups.com
Hi,

Don't use the Classifier class and its predict() function - those are made for images. But you can look at classifier.py and use it as an example for writing your own functions.

To initialize the net, `net = caffe.Net(MODEL_FILE, PRETRAINED_MODEL, caffe.TEST)` is enough.

To predict values, in my case I preprocess my own data and use the forward function on one value at a time:
out = NET.forward(**{NET.inputs[0]: np.asarray([im_proc])})

where im_proc is the preprocessed data to be sent to the model (it needs to match the shape of the input data defined in your deploy.prototxt) and out is the model's output for this data.

Be careful with your deploy.prototxt. The input definition in my deploy.prototxt looks like:

name: "nameofthemodel"
input: "data"
input_dim: 1
input_dim: 3
input_dim: 227
input_dim: 227

In your case it will probably be:
name: "nameofthemodel"
input: "data"
input_dim: 1
input_dim: 2
input_dim: 1
input_dim: 1
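
To make that concrete, here is a minimal sketch of the whole prediction path (the file names are placeholders - substitute your own deploy prototxt and the .caffemodel snapshot written during training):

import numpy as np
import caffe

MODEL_FILE = 'deploy.prototxt'            # placeholder: your deploy definition
PRETRAINED_MODEL = 'snapshot.caffemodel'  # placeholder: weights from training

net = caffe.Net(MODEL_FILE, PRETRAINED_MODEL, caffe.TEST)

# One (x, y) point, shaped (1, 2, 1, 1) to match the input_dims above.
point = np.array([0.5, -0.5], dtype=np.float32).reshape(1, 2, 1, 1)

out = net.forward(**{net.inputs[0]: point})
prob = out['prob']       # softmax output, shape (1, 2)
label = prob.argmax()    # 0 for y <= x, 1 for y > x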

Best,
Yoann

Boaz

May 6, 2015, 9:47:17 AM
To: caffe...@googlegroups.com
Looking at your training results, everything looks fine to me. So I guess it's something in the classification test code...
Do the ReLU layers in your model_prod.prototxt match those in model.prototxt? (Maybe you can post the model.prototxt too.)
Message deleted
Message deleted

Raffael

May 6, 2015, 10:06:42 AM
To: caffe...@googlegroups.com
Hey Boaz,

sorry, I had to delete the previous pastes - those were earlier versions.

model_prod.prototxt:

name: "SimpleNet"

input
: "data"
input_dim
: 1
input_dim
: 2
input_dim
: 1
input_dim
: 1

layer
{
  name
: "ip1"
  type
: "InnerProduct"
  bottom
: "data"
  top
: "ip1"
  inner_product_param
{
    num_output
: 10
    weight_filler
{
      type
: "xavier"
   
}
 
}
}
layer
{
  name
: "relu1"
  type
: "ReLU"
  bottom
: "ip1"
  top
: "ip1"
}
layer
{
  name
: "ip2"
  type
: "InnerProduct"
  bottom
: "ip1"
  top
: "ip2"
  inner_product_param
{
    num_output
: 2
    weight_filler
{
      type
: "xavier"
   
}
 
}
}
layer
{
  name
: "prob"
  type
: "Softmax"
  bottom
: "ip2"
  top
: "prob"
}

model.prototxt:

name: "SimpleNet"
layer
{
  name
: "simple"
  type
: "Data"
  top
: "data"
  top
: "label"
  include
{
    phase
: TRAIN
 
}
  data_param
{
    source
: "train_data_lmdb"
    batch_size
: 64
    backend
: LMDB
 
}
}
layer
{
  name
: "simple"
  type
: "Data"
  top
: "data"
  top
: "label"
  include
{
    phase
: TEST
 
}
  data_param
{
    source
: "test_data_lmdb"
    batch_size
: 100
    backend
: LMDB
 
}
}
layer
{
  name
: "ip1"
  type
: "InnerProduct"
  bottom
: "data"
  top
: "ip1"
  inner_product_param
{
    num_output
: 10
    weight_filler
{
      type
: "xavier"
   
}
 
}
}
layer
{
  name
: "relu1"
  type
: "ReLU"
  bottom
: "ip1"
  top
: "ip1"
}
layer
{
  name
: "ip2"
  type
: "InnerProduct"
  bottom
: "ip1"
  top
: "ip2"
  inner_product_param
{
    num_output
: 2
    weight_filler
{
      type
: "xavier"
   
}
 
}
}
layer
{
  name
: "accuracy"
  type
: "Accuracy"
  bottom
: "ip2"
  bottom
: "label"
  top
: "accuracy"
  include
{
    phase
: TEST
 
}
}
layer
{
  name
: "loss"
  type
: "SoftmaxWithLoss"
  bottom
: "ip2"
  bottom
: "label"
  top
: "loss"
}

Boaz

May 6, 2015, 10:28:18 AM
To: caffe...@googlegroups.com
That looks fine too.

Btw, I'm not a Python pro, but what is happening here?

def load_csv_into_lmdb(csv_name, lmdb_name):
    df = pd.read_csv(csv_name)
    y = df.ix[:,0].as_matrix()
    x = df.ix[:,1:].as_matrix()
    x = x[:,:,None,None]

Aren't you forgetting the labels at df.ix[:,2:]?

I also don't see how you're adding both x and y to the datum float_data field. And then for the label you're using datum.label = int(y[i]).

Hope I'm not missing something because I'm too much of a Python noob... ;)

Raffael

May 6, 2015, 10:38:13 AM
To: caffe...@googlegroups.com
Hi Boaz - thanks for your input!

Btw, I'm not a Python pro, but what is happening here?

def load_csv_into_lmdb(csv_name, lmdb_name):
    df = pd.read_csv(csv_name)
    y = df.ix[:,0].as_matrix()
    x = df.ix[:,1:].as_matrix()
    x = x[:,:,None,None]

Aren't you forgetting the labels at df.ix[:,2:]?

df.ix[:,0] references all rows (:) of the first column (0) - this is the label.
df.ix[:,1:] references all rows (:) of all columns from the second onwards (1:) - this is the data, i.e. the two coordinates x and y (not to be confused with the variable names here).
 
I also don't see how you're adding both x and y to the datum float_data field. And then for the label you're using datum.label = int(y[i]).

The labels are stored in y and converted to integers element-wise (iterating over the cases - one case per Datum).

datum.float_data.extend(x[i].flat)

The line above adds one case of the data to a Datum (see array_to_datum() in https://github.com/BVLC/caffe/blob/master/python/caffe/io.py). Previously I did what is now commented out, but that led to the network not learning anything - that approach is only intended for integer data.

The confusion probably arose from the variable x containing the data (the coordinates x and y) and y containing the labels.

So, as far as I can tell, everything is alright. I also manually checked the contents of the LMDB databases, and they contained the Datums as I expected.
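
To make this concrete, the loading function looks roughly like this. The pandas and Datum lines are from my notebook; the LMDB bookkeeping (env, map_size, the key format) is paraphrased here as a sketch using the standard lmdb module:

import lmdb
import pandas as pd
from caffe.proto import caffe_pb2

def load_csv_into_lmdb(csv_name, lmdb_name):
    df = pd.read_csv(csv_name)
    y = df.ix[:, 0].as_matrix()     # first column: the label
    x = df.ix[:, 1:].as_matrix()    # remaining columns: the coordinates
    x = x[:, :, None, None]         # reshape to (N, 2, 1, 1)

    env = lmdb.open(lmdb_name, map_size=x.nbytes * 10)
    with env.begin(write=True) as txn:
        for i in range(x.shape[0]):
            datum = caffe_pb2.Datum()
            datum.channels = int(x.shape[1])
            datum.height = 1
            datum.width = 1
            # float_data keeps the real-valued coordinates; datum.data
            # is byte-based and only suited to integer data.
            datum.float_data.extend(x[i].flat)
            datum.label = int(y[i])
            txn.put('{:08d}'.format(i).encode('ascii'), datum.SerializeToString())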
Message deleted

Raffael

May 6, 2015, 10:47:17 AM
To: caffe...@googlegroups.com
Hi Yoann,

thanks - that looks very promising!

I will give this a try asap.

I assume you mean forward_all() instead of forward(), and that NET is the 'net' you defined previously.
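
For what it's worth, here is the difference as I understand it - just a sketch, assuming the net initialized via caffe.Net as above:

import numpy as np

# forward() runs exactly one batch of the shape given by the deploy
# input_dims, while forward_all() accepts any number of cases and
# splits them into batches internally.
points = np.random.rand(100, 2, 1, 1).astype(np.float32)
out = net.forward_all(**{net.inputs[0]: points})
predictions = out['prob'].argmax(axis=1)   # one predicted label per point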

And yes, I defined the input_dims as you suggested - thanks for confirming. I don't see why one would have to specify the first input_dim at all for deployment, and I also don't understand why two prototxts are needed. It seems overly complicated and a source of confusion and mistakes. As I see it - and I don't see much yet - both specifications have to be congruent except for defined parts at the beginning and the end.

Kind Regards

Raffael

Raffael

May 6, 2015, 1:11:10 PM
To: caffe...@googlegroups.com
Yesssssssssss - it f'ing works! Awesome, Yoann, thanks! :)

Yoann

May 7, 2015, 2:56:14 AM
To: caffe...@googlegroups.com
You're welcome ;)
Glad it helped.

Best,
Yoann
Message deleted

Shaunak De

July 15, 2015, 11:03:48 AM
To: caffe...@googlegroups.com
Hey, I'm having some trouble using the net.forward function.

I realize that the input array has to be prepared using the caffe.io.Transformer class, but I can't figure out how to go about it. An example would be much appreciated.

Simply calling `out = net.forward(**{net.inputs[0]: im_proc})` raised an IndexError: list index out of range.

Shaunak De

July 15, 2015, 12:36:10 PM
To: caffe...@googlegroups.com
Okay, I solved this using this link: http://stackoverflow.com/questions/29124840/prediction-in-caffe-exception-input-blob-arguments-do-not-match-net-inputs
Although, the output seems to contain a string. Is that what is supposed to happen?
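
For anyone landing here later: the fix from that link amounts to setting up a Transformer and filling the input blob directly. A rough sketch (all paths are placeholders):

import caffe

# Placeholder paths for the deploy definition and trained weights.
net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)

# Transformer turns an HxWxC float image in [0, 1] into the
# CxHxW, BGR, [0, 255] layout the net was trained on.
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_raw_scale('data', 255)
transformer.set_channel_swap('data', (2, 1, 0))

im = caffe.io.load_image('example.jpg')   # placeholder image path
net.blobs['data'].data[...] = transformer.preprocess('data', im)
out = net.forward()

As for the string: forward() returns a dict mapping output blob names (strings) to arrays, so iterating over it yields the names rather than the values - that may be what you are seeing. The actual scores are in out['prob'] (or whatever your output blob is called).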