Jalen Hawkins

Jun 22, 2016, 1:13:09 PM
to Caffe Users
I am running a small example to test my understanding of the system. I am creating a small network (40 training files and 8 validation files across 4 classes), and the problem I have run into is that my accuracy is 0 until the 120th iteration, where it jumps to 1 and then drops back to 0. I have dropped my learning rate to 0.0001, and my loss stays in the range of 3 to 5. Does anyone know any solutions?


I0622 13:08:36.713556  9231 solver.cpp:280] Learning Rate Policy: step
I0622 13:08:36.717217  9231 solver.cpp:337] Iteration 0, Testing net (#0)
I0622 13:08:36.842509  9231 solver.cpp:404]     Test net output #0: accuracy = 0
I0622 13:08:36.842541  9231 solver.cpp:404]     Test net output #1: loss = 7.67709 (* 1 = 7.67709 loss)
I0622 13:08:37.158406  9231 solver.cpp:228] Iteration 0, loss = 6.35291
I0622 13:08:37.158449  9231 solver.cpp:244]     Train net output #0: loss = 6.35291 (* 1 = 6.35291 loss)
I0622 13:08:37.158468  9231 sgd_solver.cpp:106] Iteration 0, lr = 0.0001
I0622 13:08:50.262631  9231 solver.cpp:454] Snapshotting to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_40.caffemodel
I0622 13:08:50.324101  9231 sgd_solver.cpp:273] Snapshotting solver state to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_40.solverstate
I0622 13:08:50.362587  9231 solver.cpp:337] Iteration 40, Testing net (#0)
I0622 13:08:50.486214  9231 solver.cpp:404]     Test net output #0: accuracy = 0
I0622 13:08:50.486261  9231 solver.cpp:404]     Test net output #1: loss = 5.00905 (* 1 = 5.00905 loss)
I0622 13:08:50.793370  9231 solver.cpp:228] Iteration 40, loss = 3.55094
I0622 13:08:50.793406  9231 solver.cpp:244]     Train net output #0: loss = 3.55094 (* 1 = 3.55094 loss)
I0622 13:08:50.793411  9231 sgd_solver.cpp:106] Iteration 40, lr = 1e-12
I0622 13:09:03.913767  9231 solver.cpp:454] Snapshotting to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_80.caffemodel
I0622 13:09:03.979215  9231 sgd_solver.cpp:273] Snapshotting solver state to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_80.solverstate
I0622 13:09:04.017534  9231 solver.cpp:337] Iteration 80, Testing net (#0)
I0622 13:09:04.135341  9231 solver.cpp:404]     Test net output #0: accuracy = 0
I0622 13:09:04.135375  9231 solver.cpp:404]     Test net output #1: loss = 3.70063 (* 1 = 3.70063 loss)
I0622 13:09:04.444550  9231 solver.cpp:228] Iteration 80, loss = 3.54556
I0622 13:09:04.444594  9231 solver.cpp:244]     Train net output #0: loss = 3.54556 (* 1 = 3.54556 loss)
I0622 13:09:04.444603  9231 sgd_solver.cpp:106] Iteration 80, lr = 1e-20
I0622 13:09:17.573283  9231 solver.cpp:454] Snapshotting to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_120.caffemodel
I0622 13:09:17.639202  9231 sgd_solver.cpp:273] Snapshotting solver state to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_120.solverstate
I0622 13:09:17.677817  9231 solver.cpp:337] Iteration 120, Testing net (#0)
I0622 13:09:17.796124  9231 solver.cpp:404]     Test net output #0: accuracy = 1
I0622 13:09:17.796161  9231 solver.cpp:404]     Test net output #1: loss = 0.213932 (* 1 = 0.213932 loss)
I0622 13:09:18.104405  9231 solver.cpp:228] Iteration 120, loss = 3.15777
I0622 13:09:18.104449  9231 solver.cpp:244]     Train net output #0: loss = 3.15777 (* 1 = 3.15777 loss)
I0622 13:09:18.104467  9231 sgd_solver.cpp:106] Iteration 120, lr = 1e-28
I0622 13:09:32.082479  9231 solver.cpp:454] Snapshotting to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_160.caffemodel
I0622 13:09:32.147368  9231 sgd_solver.cpp:273] Snapshotting solver state to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_160.solverstate
I0622 13:09:32.186655  9231 solver.cpp:337] Iteration 160, Testing net (#0)
I0622 13:09:32.311929  9231 solver.cpp:404]     Test net output #0: accuracy = 0
I0622 13:09:32.311976  9231 solver.cpp:404]     Test net output #1: loss = 3.70125 (* 1 = 3.70125 loss)
I0622 13:09:32.618758  9231 solver.cpp:228] Iteration 160, loss = 5.15814
I0622 13:09:32.618805  9231 solver.cpp:244]     Train net output #0: loss = 5.15814 (* 1 = 5.15814 loss)
I0622 13:09:32.618824  9231 sgd_solver.cpp:106] Iteration 160, lr = 1e-36
I0622 13:09:47.952828  9231 solver.cpp:454] Snapshotting to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_200.caffemodel
I0622 13:09:48.017726  9231 sgd_solver.cpp:273] Snapshotting solver state to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_200.solverstate
I0622 13:09:48.171784  9231 solver.cpp:317] Iteration 200, loss = 4.59124
I0622 13:09:48.171813  9231 solver.cpp:337] Iteration 200, Testing net (#0)
I0622 13:09:48.288655  9231 solver.cpp:404]     Test net output #0: accuracy = 0
I0622 13:09:48.288714  9231 solver.cpp:404]     Test net output #1: loss = 4.11947 (* 1 = 4.11947 loss)
I0622 13:09:48.288718  9231 solver.cpp:322] Optimization Done.
I0622 13:09:48.288722  9231 caffe.cpp:222] Optimization Done.
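
One thing this log makes visible: the learning rate collapses almost immediately. Under Caffe's "step" policy (announced in the first line), the rate follows

    lr(iter) = base_lr * gamma ^ floor(iter / stepsize)

Here it falls from 0.0001 at iteration 0 to 1e-12 by iteration 40 and 1e-36 by iteration 160, i.e. a 10x drop every 5 iterations, so after the first handful of updates the weights are essentially frozen, which is consistent with the training loss barely improving after iteration 40.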

Daniel Moodie

Jun 22, 2016, 3:59:42 PM
to Caffe Users
Hello,

What type of input layer are you using? If you are using HDF5, try turning shuffle on (see the sketch below).
I see similar issues when I have input data ordered such that a single batch may contain only one class: if your network initially outputs only class 0, accuracy will stay at 0 until the test pass runs into a group of class-0 samples.
Furthermore, what is your network architecture?
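
For example, a minimal HDF5 data layer with shuffling turned on might look like this sketch (the source path is a placeholder; it should point at a text file listing your .h5 files, one per line):

layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    # placeholder path: a text file listing one .h5 file per line
    source: "/path/to/train_h5_list.txt"
    batch_size: 10
    shuffle: true
  }
}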

Jalen Hawkins

Jun 24, 2016, 8:21:27 AM
to Caffe Users
I am using LMDB. I have 4 classes (0-3), with 10 training and 2 validation pictures per class.

name: "CaffeNet"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 256
    mean_file: "/home/addisonbe/Desktop/sample/imagenet_mean1.binaryproto"
  }
# mean pixel / channel-wise mean instead of mean image
#  transform_param {
#    crop_size: 227
#    mean_value: 104
#    mean_value: 117
#    mean_value: 123
#    mirror: true
#  }
  data_param {
    source: "/home/addisonbe/Desktop/sample/example_train_lmdb"
    batch_size: 1
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mirror: false
    crop_size: 256
    mean_file: "/home/addisonbe/Desktop/sample/imagenet_mean1.binaryproto"
  }
# mean pixel / channel-wise mean instead of mean image
#  transform_param {
#    crop_size: 227
#    mean_value: 104
#    mean_value: 117
#    mean_value: 123
#    mirror: false
#  }
  data_param {
    source: "/home/addisonbe/Desktop/sample/example_val_lmdb"
    batch_size: 1
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}
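
Two details in this net stand out for a 4-class problem: fc6 has num_output: 2, which forces every image through a 2-unit bottleneck before the 4096-wide fc7, and fc8 has num_output: 1000 even though the labels only run 0-3. A sketch of those two layers with more conventional settings, keeping everything else as posted (4096 is the stock CaffeNet width for fc6; the param { lr_mult ... } blocks are omitted here for brevity and would stay as they are above):

layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  inner_product_param {
    # stock CaffeNet hidden width instead of a 2-unit bottleneck
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 1 }
  }
}
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  inner_product_param {
    # one output per class: 4 classes in this dataset
    num_output: 4
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}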

Jalen Hawkins

Jun 24, 2016, 2:16:19 PM
to Caffe Users
Okay, so I found a few slight mechanical errors, such as the names of some of the pictures not matching the .txt file. I have also made changes to my .prototxt files, which let me get my accuracy up to 0.25, but I am still not satisfied; I was aiming for an accuracy around 50%. Are there any other changes I could make to boost it?


I0624 12:01:48.271173 13904 solver.cpp:337] Iteration 0, Testing net (#0)
I0624 12:01:48.772738 13904 solver.cpp:404]     Test net output #0: accuracy = 0
I0624 12:01:48.772775 13904 solver.cpp:404]     Test net output #1: loss = 7.55048 (* 1 = 7.55048 loss)
I0624 12:01:49.080332 13904 solver.cpp:228] Iteration 0, loss = 7.99193
I0624 12:01:49.080377 13904 solver.cpp:244]     Train net output #0: loss = 7.99193 (* 1 = 7.99193 loss)
I0624 12:01:49.080384 13904 sgd_solver.cpp:106] Iteration 0, lr = 0.0001
I0624 12:02:03.105518 13904 solver.cpp:454] Snapshotting to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_40.caffemodel
I0624 12:02:03.165858 13904 sgd_solver.cpp:273] Snapshotting solver state to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_40.solverstate
I0624 12:02:03.205032 13904 solver.cpp:337] Iteration 40, Testing net (#0)
I0624 12:02:03.669890 13904 solver.cpp:404]     Test net output #0: accuracy = 0.25
I0624 12:02:03.669930 13904 solver.cpp:404]     Test net output #1: loss = 2.78939 (* 1 = 2.78939 loss)
I0624 12:02:03.972311 13904 solver.cpp:228] Iteration 40, loss = 0.531824
I0624 12:02:03.972353 13904 solver.cpp:244]     Train net output #0: loss = 0.531824 (* 1 = 0.531824 loss)
I0624 12:02:03.972362 13904 sgd_solver.cpp:106] Iteration 40, lr = 1e-08
I0624 12:02:17.054237 13904 solver.cpp:454] Snapshotting to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_80.caffemodel
I0624 12:02:17.118906 13904 sgd_solver.cpp:273] Snapshotting solver state to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_80.solverstate
I0624 12:02:17.157935 13904 solver.cpp:337] Iteration 80, Testing net (#0)
I0624 12:02:17.616801 13904 solver.cpp:404]     Test net output #0: accuracy = 0.25
I0624 12:02:17.616839 13904 solver.cpp:404]     Test net output #1: loss = 6.77176 (* 1 = 6.77176 loss)
I0624 12:02:17.920231 13904 solver.cpp:228] Iteration 80, loss = 1.0507
I0624 12:02:17.920266 13904 solver.cpp:244]     Train net output #0: loss = 1.0507 (* 1 = 1.0507 loss)
I0624 12:02:17.920285 13904 sgd_solver.cpp:106] Iteration 80, lr = 1e-12
I0624 12:02:31.004828 13904 solver.cpp:454] Snapshotting to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_120.caffemodel
I0624 12:02:31.070490 13904 sgd_solver.cpp:273] Snapshotting solver state to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_120.solverstate
I0624 12:02:31.109570 13904 solver.cpp:337] Iteration 120, Testing net (#0)
I0624 12:02:31.573617 13904 solver.cpp:404]     Test net output #0: accuracy = 0.25
I0624 12:02:31.573657 13904 solver.cpp:404]     Test net output #1: loss = 2.76008 (* 1 = 2.76008 loss)
I0624 12:02:31.881237 13904 solver.cpp:228] Iteration 120, loss = 1.41541
I0624 12:02:31.881274 13904 solver.cpp:244]     Train net output #0: loss = 1.41541 (* 1 = 1.41541 loss)
I0624 12:02:31.881284 13904 sgd_solver.cpp:106] Iteration 120, lr = 1e-16
I0624 12:02:44.980685 13904 solver.cpp:454] Snapshotting to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_160.caffemodel
I0624 12:02:45.045682 13904 sgd_solver.cpp:273] Snapshotting solver state to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_160.solverstate
I0624 12:02:45.084786 13904 solver.cpp:337] Iteration 160, Testing net (#0)
I0624 12:02:45.552296 13904 solver.cpp:404]     Test net output #0: accuracy = 0.25
I0624 12:02:45.552331 13904 solver.cpp:404]     Test net output #1: loss = 6.77081 (* 1 = 6.77081 loss)
I0624 12:02:45.863262 13904 solver.cpp:228] Iteration 160, loss = 0.728506
I0624 12:02:45.863302 13904 solver.cpp:244]     Train net output #0: loss = 0.728505 (* 1 = 0.728505 loss)
I0624 12:02:45.863312 13904 sgd_solver.cpp:106] Iteration 160, lr = 1e-20
I0624 12:02:59.033921 13904 solver.cpp:454] Snapshotting to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_200.caffemodel
I0624 12:02:59.420615 13904 sgd_solver.cpp:273] Snapshotting solver state to binary proto file /home/addisonbe/Desktop/sample/snapshot/caffenet_train_iter_200.solverstate
I0624 12:02:59.574303 13904 solver.cpp:317] Iteration 200, loss = 1.98218
I0624 12:02:59.574339 13904 solver.cpp:337] Iteration 200, Testing net (#0)
I0624 12:03:00.036310 13904 solver.cpp:404]     Test net output #0: accuracy = 0.25
I0624 12:03:00.036350 13904 solver.cpp:404]     Test net output #1: loss = 2.76008 (* 1 = 2.76008 loss)
I0624 12:03:00.036355 13904 solver.cpp:322] Optimization Done.
I0624 12:03:00.036356 13904 caffe.cpp:222] Optimization Done.


net: "models/bvlc_reference_caffenet/train_val1.prototxt"
test_iter: 4
test_interval: 40
base_lr: 0.0001
lr_policy: "step"
gamma: 0.1
stepsize: 10
display: 40
max_iter: 200
momentum: 0.9
weight_decay: 0.0005
snapshot: 40
snapshot_prefix: "/home/addisonbe/Desktop/sample/snapshot/caffenet_train"
solver_mode: CPU
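
This solver explains the plateau in both logs. With lr_policy: "step", the rate is base_lr * gamma ^ floor(iter / stepsize); with gamma: 0.1 and stepsize: 10 that gives 1e-8 by iteration 40 and 1e-20 by iteration 160, exactly the values printed in the second log, so the net effectively stops learning after the first 10-20 iterations. Separately, test_iter: 4 with a test batch_size of 1 scores only 4 of the 8 validation images, which is why the reported accuracy moves in steps of 0.25. A sketch of the same solver with a step size that keeps the rate usable for the whole run and a test pass covering all 8 validation images (these values are untuned suggestions, not a recipe):

net: "models/bvlc_reference_caffenet/train_val1.prototxt"
test_iter: 8            # 8 validation images at batch_size 1
test_interval: 40
base_lr: 0.0001
lr_policy: "step"
gamma: 0.1
stepsize: 100           # first 10x drop at iteration 100 instead of 10
display: 40
max_iter: 200
momentum: 0.9
weight_decay: 0.0005
snapshot: 40
snapshot_prefix: "/home/addisonbe/Desktop/sample/snapshot/caffenet_train"
solver_mode: CPU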