import h5py
import numpy as np
import caffe

f = h5py.File("train.h5", "w")
f.create_dataset("Images", (maxSamples,3,image_height,image_width), dtype='float32')
f.create_dataset("Mask", (maxSamples,1,mask_height,mask_width), dtype='float32') # in practice you want mask_height to be equal to image_height
# write your data as numpy arrays (use a transformer for the image)
shape = (1,3,image_height,image_width)
transformer = caffe.io.Transformer({'data': shape})
transformer.set_mean('data', np.array([100,109,113]))
transformer.set_transpose('data', (2,0,1))
transformer.set_raw_scale('data', 255.0)
for n, (img, gt) in enumerate(zip(data, gts)):
    f["Images"][n] = transformer.preprocess('data', img)
    f["Mask"][n] = gt.reshape((1,mask_height,mask_width))
f.close()
4. You do not need this step! You can just fine-tune from your model and replace the "InnerProduct" layers with "Convolution" layers. Of course, by doing so all the weights in the fully connected part are lost, but they are fast to retrain.
5. Here I'm not sure we need this: it seems the following initialization is possible:
layer {
  type: "Deconvolution"
  ...
  convolution_param {
    ...
    weight_filler {
      type: "bilinear"
    }
  }
}
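For reference, here is a minimal sketch of the kernel that a bilinear weight filler produces, following the widely circulated formula from FCN's surgery code (the function name is mine, not Caffe's):

```python
import numpy as np

def bilinear_kernel(size):
    # 2D bilinear upsampling kernel of shape (size, size)
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return ((1 - abs(og[0] - center) / factor) *
            (1 - abs(og[1] - center) / factor))

k = bilinear_kernel(4)  # per-axis weights [0.25, 0.75, 0.75, 0.25]
```

Initializing the deconvolution weights with this kernel makes the layer start out as plain bilinear interpolation, which is why it is a sensible default for upsampling.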
6. About the solver: it seems to use a very high momentum and a very small base learning rate (1e-10) for the un-normalized softmax. I'm not sure I understand why...
--
You received this message because you are subscribed to the Google Groups "Caffe Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to caffe-users...@googlegroups.com.
To post to this group, send email to caffe...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/caffe-users/ed48deaa-8285-433c-b477-ae61115895b9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
CONFLICT (content): Merge conflict in include/caffe/vision_layers.hpp
Hello Zizhao,

Yes, I'm doing the 59-category scenario where I group all categories outside the 59-set into the background class. So my outputs have dimensions 60xHxW.

Something I still don't understand when it comes to the number 5105: in solver.prototxt, you see these two lines:

test_iter: 5105
# make test net, but don't invoke it from the solver itself
test_interval: 1000000

According to the solver of Caffe's MNIST tutorial:

# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500

I don't get it. The batch size in the FCN trainval.prototxt is exactly 1, I'm guessing due to memory restrictions. This is what I think is going on, can you please correct me if I'm wrong:
- test_iter and test_interval don't have anything to do with the batch size.
- Since the batch size is 1, the solver will perform a forward pass on 5105 batches and calculate the gradients for each iteration. 1 iteration is equal to 1 image.
- test_interval is so high that the solver never carries out testing. But we still see it printing the loss for each iteration, once on the training subset and once on val.

Is this correct?

Thanks,
Youssef
On Tue, Aug 18, 2015 at 10:26 PM, Zizhao Zhang <mr.zizh...@gmail.com> wrote:
Hi Youssef,

I am quite new to the segmentation task, but the meaning is much clearer to me now. I thought you split the 5105 images into train and val (e.g., 3000 for train and 2105 for val). The reason I ask is that if you follow Evan's FCN training instructions, the output layer is 60xHxW (59 categories + 1 background). That's how I inferred you use all 5105 as train/val via a non-overlapping split.

However, the full PASCAL-Context dataset (10,103 images) has segmentation masks with more than 400 categories. So if you train on that, the output of your last layer should be (400+)xHxW. Are you training that way, or do you set the labels outside the 59 categories as background?

As for the bug, it is totally clear now. Thank you so much for your great help.
Best Regards,
Zizhao
Hi everyone!
layer {
  name: "fc8-conv"
  type: "Convolution"
  bottom: "fc7-conv"
  top: "fc8-conv"
  convolution_param {
    num_output: num_of_classes
    kernel_size: 1
    weight_filler { type: "gaussian" std: 1.0 }
    bias_filler { type: "constant" value: 0.0 }
  }
}
layer {
  name: "upscore"
  type: "Deconvolution"
  bottom: "fc8-conv"
  top: "upscore"
  param { lr_mult: 0 }
  convolution_param {
    kernel_size: 64
    stride: 32
    pad: 16
    num_output: num_of_classes
    group: num_of_classes
    weight_filler { type: "constant" value: 1 }
  }
}
#transformer init for preprocessing pictures loaded with OpenCV's cv2.imread(...)
shape=(1,3,imh,imw)
transformer = caffe.io.Transformer({'data': shape})
transformer.set_transpose('data', (2,0,1))
transformer.set_mean('data', np.array([100,109,113]))
#transformer.set_raw_scale('data',255) # not needed here: cv2.imread already returns values in [0, 255]
#transformer.set_channel_swap('data', (2,1,0)) # the reference model (CaffeNet) expects BGR channel order, and so does OpenCV, so no extra swap is needed
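For anyone without pycaffe handy, the same preprocessing can be reproduced in plain NumPy (a sketch; the function name is mine, and the mean values are the ones quoted above, which may not match your data):

```python
import numpy as np

def preprocess_bgr(img, mean=(100, 109, 113)):
    # img: H x W x 3 uint8 BGR array, as returned by cv2.imread
    out = img.astype(np.float32)                  # already 0..255, no raw_scale
    out = out.transpose((2, 0, 1))                # H x W x C -> C x H x W
    out -= np.asarray(mean, np.float32)[:, None, None]  # per-channel mean
    return out[None, :]                           # -> 1 x 3 x H x W batch
```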
Hello Yu,
Hi guys,
convolution_param {
  num_output: 2
  kernel_size: 1
  engine: CAFFE
}
Hi Youssef,

I am kind of confused. You said you split train/val based on the 59-category segmentation results (5105 labeled images in total), and you also said you use 5105 as validation data. I want to make sure you are not using the full PASCAL labeled training data with 400+ categories and 10,000+ images, right? I think you split these 5105 images into train/val, right?

Thanks for your help!
Zizhao
On Tuesday, August 18, 2015 at 12:05:04 PM UTC-4, Youssef Kashef wrote:

Hello Zizhao,

I sort of guessed. My train/val split is based on the 59-category segmentation results reported on the PASCAL-Context webpage. When you scroll down to the "Project Specific Downloads" section, you can download segmentation results generated by Mottaghi et al.'s CVPR 2014 paper. They generated segmentations for 5105 images; those are the ones I grouped into the validation set. I don't know what their splitting strategy was, but I am curious to learn what it is. Here's a link to a text file with those 5105 image names (excluding extension).

Youssef
On Monday, August 17, 2015 at 11:32:01 PM UTC+2, zzz wrote:

Hi Youssef,

Thanks for the details. For the PASCAL-Context database, there are 5105 training images. May I ask how you split the train/val data?

Thanks in advance for helping!
Zizhao
I highly recommend it so you can compare with other work that uses the same split.
I use the val_59.txt file to select which images go into the validation set. If an image is not in that list, it is assigned to the training set. This is specific to the PASCAL-Context dataset, where the ground truth images are saved as .mat files. Because this is a pixel segmentation task, there is no single label per image but a whole label file (the .mat file) for each image.
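That split can be sketched in a few lines (file names and the helper's name are mine, for illustration):

```python
def split_train_val(all_names, val_list_path):
    # names listed in the val file go to validation; everything else to train
    with open(val_list_path) as f:
        val_names = set(line.strip() for line in f if line.strip())
    train = [n for n in all_names if n not in val_names]
    val = [n for n in all_names if n in val_names]
    return train, val
```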
...
Hi changingivan,
...
databaseRGB = lmdb.DB(lmdbRGBpath,'MAPSIZE', 5048576000, 'NOLOCK', 'true');
databaseGT = lmdb.DB(lmdbGTPath,'MAPSIZE', 5048576000, 'NOLOCK', 'true');
%RGB
imRGB = imread(fileRGBpath);
imRGB = imRGB(:, :, [3, 2, 1]); % permute channels from RGB to BGR
imRGB = permute(imRGB, [2, 1, 3]); % flip width and height
%datumRGB = caffe_proto_('toEncodedDatum', imRGB, 0);
databaseRGB.put(key, imRGB);
%GT
[imGT, ~] = imread(fileGTpath);
imGT = permute(imGT, [2, 1]); % flip width and height
imGT = reshape(imGT,[1 size(imGT)]); %add singleton dimension
%datumMASK = caffe_proto_('toEncodedDatum', imGT, 0);
databaseGT.put(keyGT, imGT);
import lmdb
import numpy as np
import caffe
from PIL import Image

def imgs_to_lmdb_GT(paths_src, path_dst):
    in_db = lmdb.open(path_dst, map_size=int(1e12))
    with in_db.begin(write=True) as in_txn:
        for in_idx, in_ in enumerate(paths_src):
            # load the single-channel label image and add a leading
            # channel dimension: H x W -> 1 x H x W
            im = np.array(Image.open(in_))  # or load whatever ndarray you need
            im = im[None, :]
            im_dat = caffe.io.array_to_datum(im)
            in_txn.put('{:0>10d}'.format(in_idx), im_dat.SerializeToString())
    in_db.close()
    return 0
def imgs_to_lmdb(paths_src, path_dst):
    in_db = lmdb.open(path_dst, map_size=int(1e12))
    with in_db.begin(write=True) as in_txn:
        for in_idx, in_ in enumerate(paths_src):
            # load image:
            # - as np.uint8 {0, ..., 255}
            # - in BGR (switch from RGB)
            # - in Channel x Height x Width order (switch from H x W x C)
            im = np.array(Image.open(in_))  # or load whatever ndarray you need
            im = im[:, :, ::-1]
            im = im.transpose((2, 0, 1))
            im_dat = caffe.io.array_to_datum(im)
            in_txn.put('{:0>10d}'.format(in_idx), im_dat.SerializeToString())
    in_db.close()
    return 0
...
I'm having some trouble with the label matrix. I've created an image with 0 as background and 1...K marking pixels belonging to each class. I've created the lmdb according to PR#1698 for both images and labels. When I run the net I get this error:

Check failed: outer_num_ * inner_num_ == bottom[1]->count() (187500 vs. 562500) Number of labels must match number of predictions; e.g., if softmax axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.

As far as I can tell, the problem is that my labels are saved with 3 channels and not 1, but I couldn't figure out how to save them with 1 channel. Any help would be appreciated.
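The numbers in that error message are consistent with a 3-channel label blob; a quick check (the image size is inferred from the 187500 figure, so it is a guess):

```python
N, C, H, W = 1, 3, 375, 500          # 375 x 500 image, labels saved with 3 channels
pred_count = N * H * W               # what SoftmaxWithLoss expects: one label per pixel
label_count = N * C * H * W          # what a 3-channel label blob actually holds
# 187500 vs. 562500, exactly the pair reported by the Check failure
```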
THX
On Monday, June 8, 2015 at 5:00:23 AM UTC+3, Kien Nguyen Thanh wrote:
Hi all,
Is anyone managing to run the semantic segmentation FCN models on the future branch of Caffe? I have worked with the previous version of Caffe for some time, but now I'm having trouble installing, running, and testing the model provided in the Model Zoo.
1) When installing using the same procedure as before, the make commands (make pycaffe, make all and make test) return errors.
2) How do we prepare image data for segmentation? Are we using the same Python script "classify.py" to segment the probe images?
I appreciate any ideas. Thanks in advance.
How high a loss is too high? The paper doesn't seem to say much about loss values for the different datasets. Is it correct to assume that the loss generated is normalized and therefore independent of the number of classes and the image dimensions?
On Monday, August 24, 2015 at 8:40:30 PM UTC+2, zzz wrote:

Hi, has anyone made progress training FCN? How do you solve the high-loss issue?
On Friday, August 21, 2015 at 2:44:57 AM UTC-4, Ben wrote:

It seems that we're facing similar problems; I'll update if I make any progress.

On Fri, Aug 21, 2015 at 9:38 AM, Fatemeh Saleh <fateme...@gmail.com> wrote:

Hi, I used the SBD training samples mentioned in the paper, which is 8498 images, and the validation set of 736 images mentioned in a footnote of the paper. The loss is around 500K. It decreases after 10K iterations but is still high, with strong variations.
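On the scale of those numbers: the FCN prototxt uses an unnormalized softmax loss that sums over every pixel, so its magnitude grows with image area. Dividing by the pixel count gives a comparable per-pixel figure (the numbers below are hypothetical, chosen to roughly match the values reported in this thread):

```python
H, W = 384, 512                        # a typical input resolution (assumed)
total_loss = 500000.0                  # summed softmax loss, as reported above
per_pixel_loss = total_loss / (H * W)  # average loss per pixel
```

A summed loss of 500K over ~200K pixels is about 2.5 per pixel, which is far less alarming than the raw number suggests.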
...
My dataset has 2 classes, with 1000 training images of shape (5,256,256) and corresponding ground truth data of shape (1,256,256), which is a binary image (0 or 1) representing the 2 classes.
When training in solve.py you use the existing caffemodel, which I assume is 3-channel; since I want to apply it to my 5-channel dataset, can I use the same model provided?
...
...
layer {
  name: "concat"
  bottom: "in1"
  bottom: "in2"
  top: "out"
  type: "Concat"
  concat_param { axis: 1 }
}
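What that layer does, sketched with NumPy shapes (the dimensions are arbitrary examples):

```python
import numpy as np

in1 = np.zeros((1, 3, 8, 8))   # N x C x H x W
in2 = np.zeros((1, 2, 8, 8))   # same N, H, W; different channel count
out = np.concatenate([in1, in2], axis=1)   # axis 1 = channels: 3 + 2 = 5
```

Concat along axis 1 stacks channels, so all bottoms must agree on N, H, and W.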
...
...
Hello everyone,

I've been trying to train the FCN-32s fully convolutional semantic segmentation model on PASCAL-Context but keep getting very high loss values regardless of how many iterations: "Train net output #0: loss = 767455 (* 1 = 767455 loss)". Sometimes it goes as low as 440K, but then it jumps back up to something higher and oscillates. Ignoring the high loss and letting it run for 80K iterations, I still end up with a network that produces all-zero output. I can't tell what's throwing it off. My procedure in detail:
- Follow instructions in future.sh from longjon:future, except that I apply the PR merges to BVLC:master instead of longjon:master. Building off of longjon:future results in cuDNN build errors like here. Applying some of the PR merges to BVLC:master is redundant since they've already been merged to the master branch.
- Build Caffe with cuda 6.5, cuDNN. I've tried a CPU only build and got the same high-loss behavior, so I don't think it's related to GPU or the driver (then again, I only let it run for 5K iterations).
- Generate LMDB for the PASCAL-Context database. The lmdb generation script is built around Evan Shelhamer's Python snippet in this comment in PR#1698.
- The images are stored as numpy.uint8 with shape C x H x W, with C=3
- The ground truth is stored as numpy.int64 with shape C x H x W, with C=1
- The order in which the images are stored is the same as the ground truth. One lmdb for the images and one for the labels. I have two pairs of each to reflect the train/val split.
- Use net surgery to turn VGG-16 into a fully convolutional model. For this I pretty much followed the net surgery example and used the layer definitions from the FCN-32s' trainval.prototxt in Model Zoo.
- Not sure I did this one right though. The output I get for the cat image is still a single element 2D matrix.
- I've tried using VGG-16 fcn weights from HyeonwooNoh/DeconvNet/model/get_model.sh but still getting the same behavior.
- How can I verify my fully convolutional VGG-16 better?
- Apply the solve.py step for initializing the deconvolution parameters. According to Shelhamer's post here, not doing this could leave things stuck at zero.
- What's a good way of verifying the initialization is correct? I'm worried that the problem is there.
- solver.prototxt and trainval.prototxt are identical to those shared on Model Zoo; they only differ in the paths to the lmdbs.
- When I start training, I get "Train net output #0: loss = 767455 (* 1 = 767455 loss)". Sometimes it goes down by several 100K, but I never see the values < 10.0 that some have reported.
I could really use some help figuring out what I'm doing wrong and understanding why the loss is so high. It seems that people have figured out how to train these FCNs without a detailed guide, so I must be missing a critical step somewhere.
Thank you
On Wednesday, July 8, 2015 at 5:53:34 AM UTC+2, Gavin Hackeling wrote:

Yes, the problem appears to be that your labels have three channels instead of one. Assuming that your image has the shape (C, H, W) and that the channel containing your integer class labels is c, you can index that channel using "img = img[c, :, :]".
On Sunday, July 5, 2015 at 3:01:06 AM UTC-4, eran paz wrote:

Hi, were you able to run the network? I'm having some trouble with the label matrix. I've created an image with 0 as background and 1...K marking pixels belonging to each class. I've created the lmdb accordin
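Following Gavin's suggestion, here is a sketch of flattening a (C, H, W) label array to the (1, H, W) shape the loss layer expects (the function name is mine, for illustration):

```python
import numpy as np

def to_single_channel(label, c=0):
    # label: (C, H, W) integer class map, repeated across channels
    if label.ndim == 3:
        label = label[c, :, :]       # keep just the channel holding the labels
    return label[None, :, :]         # add channel dim back -> (1, H, W)
```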
...
...
...
...
...
Hi all, I'm now trying to export the prediction from the FCN using C++ for my project, but I end up with some very strange images. Could anyone help me with this, please? In my problem I built my own CNN and there are only two classes to segment (0 and 1). Here is my code to read the output of the forward pass:

const vector<Blob<float>*>& result = caffe_net.Forward(bottom_vec, &iter_loss); // forward pass
const float* result_vec = result[0]->cpu_data();
// generate the prediction from the output vector and store it in a Mat
cv::Mat srcC = cv::Mat::zeros(cv::Size(512, 384), CV_32FC1);
int nl = srcC.rows; // row count (height)
int nc = srcC.cols; // column count (width)
for (int j = 0; j < nl; j++) {
  float* data = srcC.ptr<float>(j);
  for (int i = 0; i < nc; i++) {
    // compare the scores of the two classes and set the prediction
    if (result_vec[i + j*nc + datum.height()*datum.width()] > result_vec[i + j*nc])
      data[i] = 255;
  }
}
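For comparison, the per-pixel decision that loop implements is a channel-wise argmax over the C x H x W score blob; in Python it is one line (a sketch with random scores standing in for the net output):

```python
import numpy as np

scores = np.random.rand(2, 384, 512)                  # C x H x W class scores
pred = scores.argmax(axis=0).astype(np.uint8) * 255   # 0/255 mask for 2 classes
```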
...
Hello everyone,

I have been trying to get an FCN working and I have read Youssef's instructions in a previous thread, but now I am stuck at building the lmdb (step 3). I would like to train the network with the PASCAL-Context dataset. I have looked into PR#1698 and also into Toru's script. Probably I am missing something, but it is not clear to me which script to use and/or what paths to include in the scripts. Can you please clarify the usage?

Thanks in advance.
Best regards,
Gonçalo Cruz
On Thursday, February 4, 2016 at 6:45:13 PM UTC, Jianyu Lin wrote:
...
...
Hello Gonçalo,

I wrote some Python code that follows the instructions from the PR#1698 comment. It includes a wrapper function specifically for PASCAL-Context. You can find it here. To run it you need to download the repo and follow some brief installation instructions. You basically tell it where to find the PASCAL-Context dataset files and a text file that lists which samples go into the validation set. You end up with 4 lmdbs (1 x training images + 1 x training ground truth + 1 x validation images + 1 x validation ground truth).

Hope this helps,
Youssef
On Thursday, February 18, 2016 at 3:25:12 PM UTC+1, Gonçalo Cruz wrote:
...
...