training FCN32s-RGBD on NYUDv2


An Le

Jul 20, 2016, 8:40:29 AM
to Caffe Users
Hi,

I was looking for a pre-trained FCN for the NYUD dataset that uses both RGB and depth data, but couldn't find one (only the net architecture and solver are provided), so I decided to do the training myself.

Looking into solve.py, I saw a reference to a VGG caffemodel, but I couldn't find it provided with FCN or in the VGG model zoo, so I used Andrej's version at http://cs.stanford.edu/people/karpathy/vgg_train_val.prototxt and then ran solve.py.

However, after 21k iterations, the loss still fluctuates between 1e5 and 1e6. It started around 8e5 and descended a bit to around 3e5 over the first few thousand iterations, but then jumped back up to around 7e5, with spikes reaching 1e7 or more. I don't know whether this is normal behavior and I should just leave it running. I suspected the learning rate was too high, so I set the policy to step with a gamma of 0.1 and a step size of 5000, but that didn't seem to help much.
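For reference, this is how the step policy I set behaves (a quick sketch; the base_lr value below is only illustrative, not taken from the FCN solver):

```python
def step_lr(base_lr, gamma, stepsize, it):
    """Caffe 'step' LR policy: lr = base_lr * gamma ** floor(it / stepsize)."""
    return base_lr * gamma ** (it // stepsize)

# with gamma=0.1 and stepsize=5000, the rate drops 10x every 5000 iterations
for it in (0, 5000, 10000, 21000):
    print(it, step_lr(1e-10, 0.1, 5000, it))
```

One thing to keep in mind: if the solver uses the unnormalized softmax loss (normalize: false, as I believe the FCN solvers do), the reported loss is summed over all pixels, so a loss of ~2e5 on a 425x560 image is only about 1 per pixel.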

One suspicion is that vgg_train_val.prototxt points to the ILSVRC'12 dataset, not PASCAL VOC, but since I'm using NYUD v2, I'm not sure whether that's really relevant.

Best,

skeptic.sisyphus

Jul 20, 2016, 2:04:15 PM
to Caffe Users
Hi,

I am having a similar problem, and I have also posted a question here.

Would you tell us how you've structured your input data? More specifically, are the matrices in your .mat files HxW or 1xHxW? If it is the latter, don't you get the same error I have been receiving?
The loss behaves the same for me as well; it stays around 3e7 the whole time.

Best,

Lê Hoàng Ân

Jul 21, 2016, 5:08:03 AM
to caffe...@googlegroups.com

Hi,

Great to hear that I'm not the only one experiencing this problem. I'm not sure I understand your question correctly; I used the provided nyud_layers.py for the input layer, so I guess it does all the hard work. In detail, here are some of my first-layer blob dimensions:

color				(1, 3, 425, 560)
depth				(1, 1, 425, 560)
label				(1, 1, 425, 560)
data				(1, 4, 425, 560)
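If it helps to compare, here is a numpy-only sketch of roughly what I believe the layer does to build those blobs (mean subtraction is omitted, and the RGB-to-BGR flip is my assumption from the usual Caffe convention):

```python
import numpy as np

def make_blobs(rgb, depth, label):
    """Assemble FCN-style input blobs from HxWx3 RGB, HxW depth, HxW label arrays."""
    color = rgb[:, :, ::-1].transpose(2, 0, 1)[None, ...]  # RGB->BGR, to 1x3xHxW
    d = depth[None, None, ...]                             # to 1x1xHxW
    lab = label[None, None, ...]                           # 1x1xHxW; the loss needs the singletons
    data = np.concatenate([color, d], axis=1)              # 1x4xHxW, fed to conv1_1_bgrd
    return color, d, lab, data

h, w = 425, 560
color, d, lab, data = make_blobs(np.zeros((h, w, 3), np.float32),
                                 np.zeros((h, w), np.float32),
                                 np.zeros((h, w), np.uint8))
print(color.shape, d.shape, lab.shape, data.shape)
```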

Hope it helps!


--
You received this message because you are subscribed to a topic in the Google Groups "Caffe Users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/caffe-users/PTpXo3z-HBc/unsubscribe.
To unsubscribe from this group and all its topics, send an email to caffe-users...@googlegroups.com.
To post to this group, send email to caffe...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/caffe-users/7d69d262-591b-41d2-84c4-f12cd3919f64%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.


Lê Hoàng Ân

Jul 21, 2016, 5:30:14 AM
to caffe...@googlegroups.com, Rafid Siddiqui
Hi,

I've actually had some problems with that load_label function. I don't know if it's the same problem, but the command

label = scipy.io.loadmat('{}/segmentation/img_{}.mat'.format(self.nyud_dir, idx))['segmentation'].astype(np.uint8)

does not load the data correctly. It actually raised two errors: a bad-key error (since there's no 'segmentation' key in the .mat file) and a type-conversion error, since the loaded matrix is rather messy.

So I had to modify it a bit, but as far as I remember, the label was a 2D (probably WxH) array.
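Roughly, my workaround looked like this (a sketch, not my exact code; it avoids hard-coding the key, and the class shift follows the 0-39/void-255 convention from the docstring):

```python
import numpy as np

def extract_label(mat, void=255):
    """Pull the label matrix out of a loadmat() dict without hard-coding the key.

    `mat` is the dict returned by scipy.io.loadmat; keys starting with '__'
    are loadmat metadata, not data.  The -1 shift maps classes 1..40 to 0..39
    and 0 (unlabeled) to 255, so the loss can ignore void pixels.
    """
    keys = [k for k in mat if not k.startswith('__')]
    assert len(keys) == 1, 'ambiguous .mat contents: {}'.format(keys)
    label = mat[keys[0]].astype(np.int32) - 1
    label[label == -1] = void        # shift before the uint8 cast to avoid wraparound
    label = label.astype(np.uint8)
    return label[np.newaxis, ...]    # leading singleton required by the loss

# stand-in for scipy.io.loadmat output on a small 2x2 example
mat = {'__header__': b'...', '__version__': '1.0',
       'seg': np.array([[0, 1], [40, 2]], dtype=np.float64)}
lab = extract_label(mat)
print(lab.shape, lab.dtype)
```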


On Thu, Jul 21, 2016 at 11:18 AM, Rafid Siddiqui <jawad...@gmail.com> wrote:
Hi,

Thanks for the info. I am actually not using the NYUD dataset; rather, I have different 4-channel images. However, I am using the same procedure as for the NYUD dataset and preparing the data for nyud_layers.py. As far as I understood, it reads a .mat file of labels for each ground-truth image, so I am creating such .mat files.
Now the question is: should these raw .mat files contain a matrix of size 1xHxW, or just a 2D matrix of size HxW? With only a 2D matrix in the .mat file, I get the FCN running, but the loss does not descend. With an additional singleton dimension in the matrix (i.e. 1xHxW), the FCN code crashes with an error saying it can only handle 4 channels or less. The reason I am trying to use the singleton dimension is that the nyud_layers code states:

def load_label(self, idx):
    """
    Load label image as 1 x height x width integer array of label indices.
    Shift labels so that classes are 0-39 and void is 255 (to ignore it).
    The leading singleton dimension is required by the loss.
    """

...

So I am not sure whether I am providing the data correctly. Is the singleton dimension a must for labels?





Lê Hoàng Ân

Jul 21, 2016, 8:07:45 AM
to caffe...@googlegroups.com, Rafid Siddiqui
Yes, the dimension at that command is surely WxH, because later in the load_label function there is

label = label[np.newaxis, ...]

so I guess the dimension then matches the requirement.
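A quick check of that behavior, for anyone unsure:

```python
import numpy as np

label = np.zeros((425, 560), dtype=np.uint8)  # 2D array as stored in the .mat
label = label[np.newaxis, ...]                # adds the leading singleton the loss expects
print(label.shape)  # (1, 425, 560)
```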


--
Lê Hoàng Ân

On Thu, Jul 21, 2016 at 12:01 PM, Rafid Siddiqui <jawad...@gmail.com> wrote:
As it appears, this command loads a .mat file for each label image, but if I look into the NYUD dataset, it has a single large .mat file of size HxWxN, where N is the number of images. So did you split the NYUD dataset, or do you have some other, older version of it?
If you are sure that the dimension of the data was HxW, then I guess the problem might be the absence of the singleton dimension, as the comments in the nyud_layers file clearly say it is needed for computing the loss.
In your previous message you listed the layer dimensions, where I can see the singleton dimension; would you tell us how you obtained those values?

skeptic.sisyphus

Jul 21, 2016, 11:08:17 AM
to Caffe Users, jawad...@gmail.com
You are right; the problem is not with the dimensions. It seems the problem is the weights becoming NaN: the layer conv1_2 has NaN weights, while conv1_1_bgrd does not. Is it similar for you as well?
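For anyone wanting to check their own net, here is a quick NaN scan (the `params` dict below is a stand-in; with pycaffe it would be built as `{k: v[0].data for k, v in net.params.items()}`):

```python
import numpy as np

def nan_layers(params):
    """Return names of layers whose weight arrays contain NaN or Inf."""
    return [name for name, w in params.items()
            if not np.isfinite(w).all()]

# stand-in weight dict: one healthy layer, one with NaN weights
params = {'conv1_1_bgrd': np.ones((64, 4, 3, 3), np.float32),
          'conv1_2': np.full((64, 64, 3, 3), np.nan, np.float32)}
print(nan_layers(params))  # ['conv1_2']
```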

skeptic.sisyphus

Jul 22, 2016, 2:53:39 AM
to Caffe Users, jawad...@gmail.com
Did you get any solution? Mine now gets stuck in a local minimum: it has been oscillating for more than 20K iterations without descending. The loss is around 20-40K.

Lê Hoàng Ân

Jul 22, 2016, 3:26:17 AM
to skeptic.sisyphus, Caffe Users

Hi,

No progress yet. But you managed to reduce the loss to 20-40K already; maybe I can take your advice. What have you done?



Sepideh Hosseinzadeh

Dec 12, 2016, 6:03:10 PM
to Caffe Users, jawad...@gmail.com
Hi An,
Could you please tell us the steps needed to train this network? I want to train it on my own dataset, but I don't know how to change trainval.prototxt and the other files. What are the color, depth, label, and data blobs?

Many thanks in advance.