Making predictions with pycaffe for non-image data. What am I doing wrong?


AlexK

Jun 18, 2015, 2:34:47 PM
to caffe...@googlegroups.com
Dear all,

I have spent more than a month now trying to make Caffe work. I am able to design a network and to train and test a model, but I still cannot apply the model to make predictions on new, unlabeled data. So I need your input, as I am ready to give up on Caffe for good.


Here I present an example of logistic regression, which works fine:


name: "LogisticRegressionNet"
layer
{
  name
: "data"
  type
: "HDF5Data"
  top
: "data"
  top
: "label"
  include
{
    phase
: TRAIN
 
}
  hdf5_data_param
{
    source
: "/projects/Caffe_DeepLearning/Exp4/train.txt"
    batch_size
: 10
 
}
}
layer
{
  name
: "data"
  type
: "HDF5Data"
  top
: "data"
  top
: "label"
  include
{
    phase
: TEST
 
}
  hdf5_data_param
{
    source
: "/projects/Caffe_DeepLearning/Exp4/test.txt"
    batch_size
: 10
 
}
}
layer
{
  name
: "fc1"
  type
: "InnerProduct"
  bottom
: "data"
  top
: "fc1"
  param
{
    lr_mult
: 1
    decay_mult
: 1
 
}
  param
{
    lr_mult
: 2
    decay_mult
: 0
 
}
  inner_product_param
{
    num_output
: 2
    weight_filler
{
      type
: "gaussian"
      std
: 0.01
   
}
    bias_filler
{
      type
: "constant"
      value
: 0
   
}
 
}
}
layer
{
  name
: "loss"
  type
: "SoftmaxWithLoss"
  bottom
: "fc1"
  bottom
: "label"
  top
: "loss"
}
layer
{
  name
: "accuracy"
  type
: "Accuracy"
  bottom
: "fc1"
  bottom
: "label"
  top
: "accuracy"
  include
{
    phase
: TEST
 
}
}




The data are generated by sklearn:

def gener_Data():
    X, y = sklearn.datasets.make_classification(n_samples=10000, n_features=4,
                                                n_redundant=0, random_state=69)
    X, Xt, y, yt = sklearn.cross_validation.train_test_split(X, y)
    print y
    print yt
    return X, Xt, y, yt


def write_data(ofile1, ofile2, ofile3, X, Xt, y, yt):
    with h5py.File(ofile1, 'w') as f:
        f['data'] = X
        f['label'] = y.astype(np.float32)
    with h5py.File(ofile2, 'w') as g:
        g['data'] = Xt
        g['label'] = yt.astype(np.float32)
    with open(ofile3, 'w') as k:
        for j in y:
            k.write(str(j) + "\n")
    return
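
Note that the source files referenced in the prototxt (train.txt and test.txt) are not the HDF5 files themselves but plain-text lists of HDF5 file paths, one path per line; that is what the "Loading list of HDF5 filenames" line in the training log below refers to. A minimal sketch of such a list file, assuming the HDF5 file written by write_data is named train.h5 (the filename is my assumption, for illustration):

/projects/Caffe_DeepLearning/Exp4/train.h5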


The solver file:

net: "/projects/Caffe_DeepLearning/Exp4/train_val.prototxt"
test_iter: 250
test_interval: 1000
base_lr: 0.01
lr_policy: "step"
gamma: 0.1
stepsize: 5000
display: 1000
max_iter: 10000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "/projects/huanlab/AlexKoutsoukas_Folder/Caffe_DeepLearning/Exp4/train"
solver_mode: CPU
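
As a quick sanity check on these numbers: test_iter (250) times the test batch_size (10) gives 2,500 samples per test phase, which matches the 2,500-sample test split that train_test_split produces by default (25% of 10,000).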


The network trains just fine:

I0617 17:51:36.617648  3682 layer_factory.hpp:74] Creating layer data
I0617 17:51:36.617660  3682 net.cpp:76] Creating Layer data
I0617 17:51:36.617667  3682 net.cpp:334] data -> data
I0617 17:51:36.617679  3682 net.cpp:334] data -> label
I0617 17:51:36.617688  3682 net.cpp:105] Setting up data
I0617 17:51:36.617693  3682 hdf5_data_layer.cpp:66] Loading list of HDF5 filenames from: /projects/Caffe_DeepLearning/Exp4/test.txt
I0617 17:51:36.617959  3682 hdf5_data_layer.cpp:80] Number of HDF5 files: 1
I0617 17:51:36.619175  3682 net.cpp:112] Top shape: 10 4 1 1 (40)
I0617 17:51:36.619189  3682 net.cpp:112] Top shape: 10 1 1 1 (10)
I0617 17:51:36.619195  3682 layer_factory.hpp:74] Creating layer label_data_1_split
I0617 17:51:36.619205  3682 net.cpp:76] Creating Layer label_data_1_split
I0617 17:51:36.619211  3682 net.cpp:372] label_data_1_split <- label
I0617 17:51:36.619220  3682 net.cpp:334] label_data_1_split -> label_data_1_split_0
I0617 17:51:36.619230  3682 net.cpp:334] label_data_1_split -> label_data_1_split_1
I0617 17:51:36.619240  3682 net.cpp:105] Setting up label_data_1_split
I0617 17:51:36.619246  3682 net.cpp:112] Top shape: 10 1 1 1 (10)
I0617 17:51:36.619251  3682 net.cpp:112] Top shape: 10 1 1 1 (10)
I0617 17:51:36.619256  3682 layer_factory.hpp:74] Creating layer fc1
I0617 17:51:36.619266  3682 net.cpp:76] Creating Layer fc1
I0617 17:51:36.619271  3682 net.cpp:372] fc1 <- data
I0617 17:51:36.619278  3682 net.cpp:334] fc1 -> fc1
I0617 17:51:36.619307  3682 net.cpp:105] Setting up fc1
I0617 17:51:36.619321  3682 net.cpp:112] Top shape: 10 2 1 1 (20)
I0617 17:51:36.619334  3682 layer_factory.hpp:74] Creating layer fc1_fc1_0_split
I0617 17:51:36.619343  3682 net.cpp:76] Creating Layer fc1_fc1_0_split
I0617 17:51:36.619359  3682 net.cpp:372] fc1_fc1_0_split <- fc1
I0617 17:51:36.619367  3682 net.cpp:334] fc1_fc1_0_split -> fc1_fc1_0_split_0
I0617 17:51:36.619376  3682 net.cpp:334] fc1_fc1_0_split -> fc1_fc1_0_split_1
I0617 17:51:36.619385  3682 net.cpp:105] Setting up fc1_fc1_0_split
I0617 17:51:36.619391  3682 net.cpp:112] Top shape: 10 2 1 1 (20)
I0617 17:51:36.619396  3682 net.cpp:112] Top shape: 10 2 1 1 (20)
I0617 17:51:36.619401  3682 layer_factory.hpp:74] Creating layer loss
I0617 17:51:36.619415  3682 net.cpp:76] Creating Layer loss
I0617 17:51:36.619421  3682 net.cpp:372] loss <- fc1_fc1_0_split_0
I0617 17:51:36.619428  3682 net.cpp:372] loss <- label_data_1_split_0
I0617 17:51:36.619436  3682 net.cpp:334] loss -> loss
I0617 17:51:36.619443  3682 net.cpp:105] Setting up loss
I0617 17:51:36.619451  3682 layer_factory.hpp:74] Creating layer loss
I0617 17:51:36.619468  3682 net.cpp:112] Top shape: 1 1 1 1 (1)
I0617 17:51:36.619475  3682 net.cpp:118]     with loss weight 1
I0617 17:51:36.619488  3682 layer_factory.hpp:74] Creating layer accuracy
I0617 17:51:36.619498  3682 net.cpp:76] Creating Layer accuracy
I0617 17:51:36.619504  3682 net.cpp:372] accuracy <- fc1_fc1_0_split_1
I0617 17:51:36.619511  3682 net.cpp:372] accuracy <- label_data_1_split_1
I0617 17:51:36.619518  3682 net.cpp:334] accuracy -> accuracy
I0617 17:51:36.619527  3682 net.cpp:105] Setting up accuracy
I0617 17:51:36.619535  3682 net.cpp:112] Top shape: 1 1 1 1 (1)
I0617 17:51:36.619554  3682 net.cpp:165] accuracy does not need backward computation.
I0617 17:51:36.619561  3682 net.cpp:163] loss needs backward computation.
I0617 17:51:36.619567  3682 net.cpp:163] fc1_fc1_0_split needs backward computation.
I0617 17:51:36.619572  3682 net.cpp:163] fc1 needs backward computation.
I0617 17:51:36.619577  3682 net.cpp:165] label_data_1_split does not need backward computation.
I0617 17:51:36.619583  3682 net.cpp:165] data does not need backward computation.
I0617 17:51:36.619591  3682 net.cpp:201] This network produces output accuracy
I0617 17:51:36.619597  3682 net.cpp:201] This network produces output loss
I0617 17:51:36.619608  3682 net.cpp:446] Collecting Learning Rate and Weight Decay.
I0617 17:51:36.619617  3682 net.cpp:213] Network initialization done.
I0617 17:51:36.619622  3682 net.cpp:214] Memory required for data: 528
I0617 17:51:36.619647  3682 solver.cpp:42] Solver scaffolding done.
I0617 17:51:36.619665  3682 solver.cpp:222] Solving LogisticRegressionNet
I0617 17:51:36.619671  3682 solver.cpp:223] Learning Rate Policy: step
I0617 17:51:36.619678  3682 solver.cpp:266] Iteration 0, Testing net (#0)
I0617 17:51:36.711675  3682 solver.cpp:315]     Test net output #0: accuracy = 0.834
I0617 17:51:36.711699  3682 solver.cpp:315]     Test net output #1: loss = 0.682673 (* 1 = 0.682673 loss)
I0617 17:51:36.712317  3682 solver.cpp:189] Iteration 0, loss = 0.684064
I0617 17:51:36.712339  3682 solver.cpp:204]     Train net output #0: loss = 0.684064 (* 1 = 0.684064 loss)
I0617 17:51:36.712354  3682 solver.cpp:470] Iteration 0, lr = 0.01
I0617 17:51:37.176923  3682 solver.cpp:266] Iteration 1000, Testing net (#0)
I0617 17:51:37.268601  3682 solver.cpp:315]     Test net output #0: accuracy = 0.9784
I0617 17:51:37.268623  3682 solver.cpp:315]     Test net output #1: loss = 0.0871707 (* 1 = 0.0871707 loss)
I0617 17:51:37.269023  3682 solver.cpp:189] Iteration 1000, loss = 0.0494762
I0617 17:51:37.269042  3682 solver.cpp:204]     Train net output #0: loss = 0.0494761 (* 1 = 0.0494761 loss)
I0617 17:51:37.269050  3682 solver.cpp:470] Iteration 1000, lr = 0.01
I0617 17:51:37.733898  3682 solver.cpp:266] Iteration 2000, Testing net (#0)
I0617 17:51:37.825458  3682 solver.cpp:315]     Test net output #0: accuracy = 0.977199
I0617 17:51:37.825479  3682 solver.cpp:315]     Test net output #1: loss = 0.0866363 (* 1 = 0.0866363 loss)
I0617 17:51:37.825940  3682 solver.cpp:189] Iteration 2000, loss = 0.242001
I0617 17:51:37.825958  3682 solver.cpp:204]     Train net output #0: loss = 0.242001 (* 1 = 0.242001 loss)
I0617 17:51:37.825968  3682 solver.cpp:470] Iteration 2000, lr = 0.01
I0617 17:51:38.290102  3682 solver.cpp:266] Iteration 3000, Testing net (#0)
I0617 17:51:38.382241  3682 solver.cpp:315]     Test net output #0: accuracy = 0.9776
I0617 17:51:38.382262  3682 solver.cpp:315]     Test net output #1: loss = 0.0871053 (* 1 = 0.0871053 loss)
I0617 17:51:38.382741  3682 solver.cpp:189] Iteration 3000, loss = 0.0756628
I0617 17:51:38.382761  3682 solver.cpp:204]     Train net output #0: loss = 0.0756624 (* 1 = 0.0756624 loss)
I0617 17:51:38.382771  3682 solver.cpp:470] Iteration 3000, lr = 0.01
I0617 17:51:38.846882  3682 solver.cpp:266] Iteration 4000, Testing net (#0)
I0617 17:51:38.938446  3682 solver.cpp:315]     Test net output #0: accuracy = 0.9784
I0617 17:51:38.938467  3682 solver.cpp:315]     Test net output #1: loss = 0.0869164 (* 1 = 0.0869164 loss)
I0617 17:51:38.938899  3682 solver.cpp:189] Iteration 4000, loss = 0.0444386
I0617 17:51:38.938916  3682 solver.cpp:204]     Train net output #0: loss = 0.044438 (* 1 = 0.044438 loss)
I0617 17:51:38.938925  3682 solver.cpp:470] Iteration 4000, lr = 0.01
I0617 17:51:39.403707  3682 solver.cpp:266] Iteration 5000, Testing net (#0)
I0617 17:51:39.495306  3682 solver.cpp:315]     Test net output #0: accuracy = 0.977199
I0617 17:51:39.495326  3682 solver.cpp:315]     Test net output #1: loss = 0.0867109 (* 1 = 0.0867109 loss)
I0617 17:51:39.495731  3682 solver.cpp:189] Iteration 5000, loss = 0.244176
I0617 17:51:39.495749  3682 solver.cpp:204]     Train net output #0: loss = 0.244175 (* 1 = 0.244175 loss)
I0617 17:51:39.495757  3682 solver.cpp:470] Iteration 5000, lr = 0.001
I0617 17:51:39.959995  3682 solver.cpp:266] Iteration 6000, Testing net (#0)
I0617 17:51:40.051751  3682 solver.cpp:315]     Test net output #0: accuracy = 0.9776
I0617 17:51:40.051774  3682 solver.cpp:315]     Test net output #1: loss = 0.0865504 (* 1 = 0.0865504 loss)
I0617 17:51:40.052196  3682 solver.cpp:189] Iteration 6000, loss = 0.0786539
I0617 17:51:40.052214  3682 solver.cpp:204]     Train net output #0: loss = 0.0786534 (* 1 = 0.0786534 loss)
I0617 17:51:40.052223  3682 solver.cpp:470] Iteration 6000, lr = 0.001
I0617 17:51:40.517168  3682 solver.cpp:266] Iteration 7000, Testing net (#0)
I0617 17:51:40.608708  3682 solver.cpp:315]     Test net output #0: accuracy = 0.9776
I0617 17:51:40.608729  3682 solver.cpp:315]     Test net output #1: loss = 0.0865185 (* 1 = 0.0865185 loss)
I0617 17:51:40.609164  3682 solver.cpp:189] Iteration 7000, loss = 0.0382315
I0617 17:51:40.609184  3682 solver.cpp:204]     Train net output #0: loss = 0.0382309 (* 1 = 0.0382309 loss)
I0617 17:51:40.609192  3682 solver.cpp:470] Iteration 7000, lr = 0.001
I0617 17:51:41.073468  3682 solver.cpp:266] Iteration 8000, Testing net (#0)
I0617 17:51:41.165096  3682 solver.cpp:315]     Test net output #0: accuracy = 0.9776
I0617 17:51:41.165117  3682 solver.cpp:315]     Test net output #1: loss = 0.0865795 (* 1 = 0.0865795 loss)
I0617 17:51:41.165603  3682 solver.cpp:189] Iteration 8000, loss = 0.231445
I0617 17:51:41.165623  3682 solver.cpp:204]     Train net output #0: loss = 0.231444 (* 1 = 0.231444 loss)
I0617 17:51:41.165632  3682 solver.cpp:470] Iteration 8000, lr = 0.001
I0617 17:51:41.630273  3682 solver.cpp:266] Iteration 9000, Testing net (#0)
I0617 17:51:41.723743  3682 solver.cpp:315]     Test net output #0: accuracy = 0.9776
I0617 17:51:41.723798  3682 solver.cpp:315]     Test net output #1: loss = 0.0865972 (* 1 = 0.0865972 loss)
I0617 17:51:41.724253  3682 solver.cpp:189] Iteration 9000, loss = 0.0765986
I0617 17:51:41.724274  3682 solver.cpp:204]     Train net output #0: loss = 0.0765984 (* 1 = 0.0765984 loss)
I0617 17:51:41.724287  3682 solver.cpp:470] Iteration 9000, lr = 0.001
I0617 17:51:42.174888  3682 solver.cpp:334] Snapshotting to /projects/Caffe_DeepLearning/Exp4/train_iter_10000.caffemodel
I0617 17:51:42.176039  3682 solver.cpp:342] Snapshotting solver state to /projects/Caffe_DeepLearning/Exp4/train_iter_10000.solverstate
I0617 17:51:42.177371  3682 solver.cpp:248] Iteration 10000, loss = 0.0396007
I0617 17:51:42.177395  3682 solver.cpp:266] Iteration 10000, Testing net (#0)
I0617 17:51:42.268739  3682 solver.cpp:315]     Test net output #0: accuracy = 0.9776
I0617 17:51:42.268764  3682 solver.cpp:315]     Test net output #1: loss = 0.0865545 (* 1 = 0.0865545 loss)
I0617 17:51:42.268772  3682 solver.cpp:253] Optimization Done.
Accuracy: 0.978


As you can see, the model achieves an accuracy of 0.978. This is great so far.

The deploy.prototxt is:

input: "data"
input_dim
: 1
input_dim
: 4
input_dim
: 1
input_dim
: 1




layer
{
  name
: "fc1"
  type
: "InnerProduct"
  bottom
: "data"
  top
: "fc1"
  param
{
    lr_mult
: 1
    decay_mult
: 1
 
}
  param
{
    lr_mult
: 2
    decay_mult
: 0
 
}
  inner_product_param
{
    num_output
: 2
    weight_filler
{
      type
: "gaussian"
      std
: 0.01
   
}
    bias_filler
{
      type
: "constant"
      value
: 0
   
}
 
}
}


layer
{
  name
: "prob"
  type
: "Softmax"
  bottom
: "fc1"
  top
: "prob"
}
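
(The four input_dim values declare the shape of the data blob in N x C x H x W order, so this deploy net expects a single sample with my four features as channels: a blob of shape (1, 4, 1, 1).)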

and the Python code. Here I use the same data that was used for training to make predictions:

import caffe
import pandas
import numpy as np
import sklearn
import sklearn.datasets
import sklearn.linear_model
from sklearn.metrics import accuracy_score

MODEL_FILE = '/projects/Caffe_DeepLearning/Exp4/deploy.prototxt'
PRETRAINED = '/projects/Caffe_DeepLearning/Exp4/train_iter_10000.caffemodel'

net = caffe.Net(MODEL_FILE, PRETRAINED, caffe.TEST)
X, y = sklearn.datasets.make_classification(n_samples=10000, n_features=4,
                                            n_redundant=0, random_state=69)

y_pred = []
for j in range(len(X)):
    feature = np.array(X[j])
    f = feature.reshape(4, 1, 1)  # my data are non-image and have 4 features, so the dims are 4,1,1
    g = np.array(f).astype(np.uint8)
    predict = net.forward_all(data=np.asarray([g]))
    y_pred.append(predict['prob'].argmax())

acc = accuracy_score(y, y_pred)
print acc
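
For reference, here is a quick sanity check on what the loaded net expects, using the standard pycaffe blob interface (a minimal sketch, run in the same session as the script above; the expected shapes follow from the deploy.prototxt):

# Inspect the deploy net's blobs; pycaffe exposes them as numpy arrays.
print net.blobs['data'].data.shape  # (1, 4, 1, 1), as declared by input_dim
print net.blobs['data'].data.dtype  # float32: Caffe blobs store single-precision floats
print net.blobs['prob'].data.shape  # (1, 2, 1, 1), one probability per class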


The model loads:

I0618 12:48:45.206326 11970 net.cpp:336] Input 0 -> data
I0618 12:48:45.206362 11970 layer_factory.hpp:74] Creating layer fc1
I0618 12:48:45.206377 11970 net.cpp:76] Creating Layer fc1
I0618 12:48:45.206384 11970 net.cpp:372] fc1 <- data
I0618 12:48:45.206393 11970 net.cpp:334] fc1 -> fc1
I0618 12:48:45.206410 11970 net.cpp:105] Setting up fc1
I0618 12:48:46.011014 11970 net.cpp:112] Top shape: 1 2 1 1 (2)
I0618 12:48:46.011075 11970 layer_factory.hpp:74] Creating layer prob
I0618 12:48:46.011097 11970 net.cpp:76] Creating Layer prob
I0618 12:48:46.011104 11970 net.cpp:372] prob <- fc1
I0618 12:48:46.011116 11970 net.cpp:334] prob -> prob
I0618 12:48:46.011128 11970 net.cpp:105] Setting up prob
I0618 12:48:46.538223 11970 net.cpp:112] Top shape: 1 2 1 1 (2)
I0618 12:48:46.538261 11970 net.cpp:165] prob does not need backward computation.
I0618 12:48:46.538267 11970 net.cpp:165] fc1 does not need backward computation.
I0618 12:48:46.538274 11970 net.cpp:201] This network produces output prob
I0618 12:48:46.538286 11970 net.cpp:446] Collecting Learning Rate and Weight Decay.
I0618 12:48:46.538298 11970 net.cpp:213] Network initialization done.
I0618 12:48:46.538305 11970 net.cpp:214] Memory required for data: 16
Accuracy:  0.4181



Here are some predictions with their labels:


1 [[[[ 0.52132756]] [[ 0.47867242]]]]
0 [[[[ 0.52132756]] [[ 0.47867242]]]]
0 [[[[ 5.26498400e-10]] [[ 1.00000000e+00]]]]
1 [[[[ 5.33656508e-10]] [[ 1.00000000e+00]]]]
1 [[[[ 0.50032425]] [[ 0.49967578]]]]
1 [[[[ 1.70592793e-11]] [[ 1.00000000e+00]]]]
0 [[[[ 0.57079709]] [[ 0.42920291]]]]
0 [[[[ 0.]] [[ 1.]]]]
1 [[[[ 1.12530563e-20]] [[ 1.00000000e+00]]]]
1 [[[[ 0.52132756]] [[ 0.47867242]]]]
1 [[[[ 1.00000000e+00]] [[ 2.95583650e-31]]]]
1 [[[[ 0.00399983]]
My question is: what am I doing wrong? The predictions obtained from Python are random, or actually worse than random.
Any feedback would be much appreciated, as at this point I am ready to give up on Caffe.

Best,
Alex. 

Antony Zebraski

Jun 18, 2015, 8:29:15 PM
to caffe...@googlegroups.com
Look at the net_surgery Python notebook (examples/net_surgery.ipynb). It has a complete example of classifying an image (a cat, no less) using a trained network. I'm a Python n00b, but I was able to use that as a starting point to test on my own network.

HTH.

AlexK

Jun 19, 2015, 11:40:48 AM
to caffe...@googlegroups.com
Hi Antony,

Thanks for the suggestion. I have checked all the examples provided with Caffe. As you can see towards the end of my post, I am able to generate predictions. My problem is that the predictions don't make sense; they are worse than random. That is my question: why are they not correct?


Best,
Alex.