Fully Convolutional Network in Torch (skip layer fusion)


Jianxu Chen

Feb 26, 2016, 10:57:08 AM
to torch7
Hello All,

Is there any known Torch implementation of a Fully Convolutional Network with skip-layer fusion? I am trying to implement a variation of FCN, but I'm not sure how to skip layers or how to do the fusion. The only solution I can think of is to combine nngraph with the convolution layers in the nn module. Does anyone have experience or suggestions?

Thanks in advance.
JC 

Jonghoon Jin

Feb 26, 2016, 11:07:13 AM
to torch7 on behalf of Jianxu Chen
how about looking at this part and combining it with FCN?


Tushar N

Feb 26, 2016, 5:12:38 PM
to torch7
Hi,

I apologize for partially hijacking your thread, but I was also trying to implement FCNs in Torch. For the 32s version, which has just a single upsampling step, I tried to translate the Caffe model directly into a Torch model (writing the layers by hand, not importing from Caffe). All the convolution/relu/pooling layers are straightforward to implement using nn modules, but there are two layers that I am slightly unsure about:

1.
layers {
  type: DECONVOLUTION
  name: 'upsample'
  bottom: 'score'
  top: 'bigscore'
  blobs_lr: 0
  blobs_lr: 0
  convolution_param {
    num_output: 21
    kernel_size: 64
    stride: 32
  }
}
Is this just the nn.SpatialFullConvolution() layer?

2.
layers {
  type: CROP
  name: 'crop'
  bottom: 'bigscore'
  bottom: 'data'
  top: 'upscore'
}
Is there an equivalent torch layer for the crop layer described here?

What did you use to write up your version of the FCN?

Regards,
Tushar


Jianxu Chen

Feb 27, 2016, 12:29:59 AM
to torch7
I think the answer to your first question is yes: that is just SpatialFullConvolution(). You may check out the conversation about the name at https://github.com/torch/nn/pull/405
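For reference, a minimal sketch of that layer in Torch (untested; the arguments are nInputPlane, nOutputPlane, kW, kH, dW, dH):

upsample = nn.SpatialFullConvolution(21, 21, 64, 64, 32, 32)

One caveat: blobs_lr: 0 in the Caffe definition freezes the upsampling filter (usually initialized to bilinear interpolation), so in Torch you would have to set those weights yourself and keep them fixed.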

For your second question, I still have no clue. 

JC

Jianxu Chen

Feb 27, 2016, 12:40:58 AM
to torch7
Hi Jonghoon,

Thank you for referring me to the code. It seems like the shortcut function only supports an identity copy or a 1x1 convolution. Do you know of any way to do the copy while cropping, namely copying a portion of one layer's output to another layer? Thanks a lot.

JC 



Tushar N

Feb 27, 2016, 9:18:08 AM
to torch7
Jianxu Chen, thanks for the link and for confirming.

I was also curious about the criterion that you plan on applying. The final output tensor is b x 21 x 500 x 500, with each channel containing the log-probabilities of a specific class (nn.SpatialLogSoftMax() is handling that).
Is there a spatial equivalent of nn.ClassNLLCriterion() that can be used directly? I have a target tensor, which is a b x 1 x 500 x 500 LongTensor with classes 1-21 as elements.

Regards,
Tushar

Jianxu Chen

unread,
Feb 27, 2016, 8:56:15 PM2/27/16
to torch7
The thing in my mind is nn.CrossEntropyCriterion, but I haven't actually tried it yet.

Also, I would appreciate it if you could post in this thread when you find a nice way to implement cropping.

JC 

Jianxu Chen

Feb 28, 2016, 11:26:10 PM
to torch7
Hi Tushar,

I realized that cropping is just two nn.Narrow layers. Then nn.JoinTable or nn.CAddTable can be used to combine the results in an nngraph module.
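For example, an untested sketch (off, cropH, and cropW are hypothetical names for the crop offset and the target height/width; lower and upper are the two branches being fused):

cropped = nn.Narrow(4, off, cropW)(nn.Narrow(3, off, cropH)(lower))
fused   = nn.CAddTable()({cropped, upper})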

JC


Tushar N

Feb 29, 2016, 3:20:54 AM
to torch7
I think the Caffe layer is slightly more elaborate. The bottom blob is the image whose dimensions need to be matched, so in a way it's dynamically cropping the image. With nn.Narrow, wouldn't you have to specify the exact start/end indices?

Of course, if you are working with just 500x500 images, then it isn't a problem. I was thinking of just using nn.SpatialZeroPadding with negative padding values (the docs say it just crops).
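For example, to crop 2 pixels from every border (a sketch; the arguments are padLeft, padRight, padTop, padBottom, and negative values crop):

crop = nn.SpatialZeroPadding(-2, -2, -2, -2)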

ginobilinie

Mar 31, 2016, 11:18:39 AM
to torch7
Hi, which criterion do you use for FCN? I have read the criterion code in Torch, and there is none that can be used directly. Did you write one yourself?


Jianxu Chen

Mar 31, 2016, 11:22:25 AM
to torch7 on behalf of ginobilinie
I used cudnn.SpatialCrossEntropyCriterion().
And I used nngraph + Narrow + ConcatTable to implement the skip+copy layers.

JC




--
Jianxu Chen
Department of Computer Science and Engineering
University of Notre Dame

zzz

Apr 19, 2016, 1:04:54 AM
to torch7
@jianxu how do you do deconvolution or upsampling?

zzz

Apr 19, 2016, 1:11:32 AM
to torch7
I see there is a module with a weird name, nn.SpatialFullConvolution.



Etienne Perot

May 3, 2016, 1:01:23 PM
to torch7
Hello Jianxu Chen,

I'm trying something very similar with cudnn.SpatialCrossEntropyCriterion

My output is (batchSize, classes, height, width):

   5
  12
 480
 360
[torch.LongStorage of size 4]

My targets are (batchSize, height, width):

   5
 480
 360
[torch.LongStorage of size 3]

Do I need to make my targets one-hot or something?

Thanks a lot in advance for your support,

Etienne



Sergey Zagoruyko

May 3, 2016, 1:17:59 PM
to torch7 on behalf of Etienne Perot
try doing: nn.utils.addSingletonDimension(targets,2):expandAs(output)

Jianxu Chen

May 3, 2016, 9:49:18 PM
to torch7

I am also using cudnn.SpatialCrossEntropyCriterion. I think your current data is good to go; no need to convert to one-hot or add a singleton dimension.
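In other words, something like the following should work directly (a sketch; output and targets stand for your two tensors, moved to the GPU since this is a cudnn criterion):

require 'cudnn'
crit = cudnn.SpatialCrossEntropyCriterion()
-- output:  5 x 12 x 480 x 360 (batchSize x classes x height x width)
-- targets: 5 x 480 x 360, with integer class indices in 1..12
loss = crit:forward(output, targets)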

Touqeer Ahmad

Jun 21, 2016, 1:17:49 PM
to torch7
Hi All,

Can anyone please share the piece of code where you were able to get the skip-layer fusion working using cropping, nngraph, and Narrow?

Thanks,
Touqeer

Jianxu Chen

Jun 21, 2016, 1:27:02 PM
to torch7 on behalf of Touqeer Ahmad
I cannot access my code on the lab server, but here is an example:

L4 = nn.SpatialConvolution(...)(L3)

-- crop dimension 3 (height), then dimension 4 (width)
Crop4 = nn.Narrow(3, 5, 2*XX-4)(L4)
L4cp  = nn.Narrow(4, 5, 2*XX-4)(Crop4)

L5 = nn.SpatialConvolution(...)(L4)

-- join the fully cropped skip branch (L4cp, not Crop4) with L5
L6 = nn.JoinTable(...)({L4cp, L5})
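To make it concrete, here is a self-contained toy version (a sketch, not my actual code; the sizes are made up so that one unpadded 3x3 convolution shrinks a 32x32 map to 30x30):

require 'nn'
require 'nngraph'

input = nn.Identity()()
conv  = nn.SpatialConvolution(16, 16, 3, 3)(input)       -- 1x16x32x32 -> 1x16x30x30
skip  = nn.Narrow(4, 2, 30)(nn.Narrow(3, 2, 30)(input))  -- crop the skip branch to 1x16x30x30
fused = nn.JoinTable(2)({skip, conv})                     -- concatenate channels -> 1x32x30x30
g     = nn.gModule({input}, {fused})

print(g:forward(torch.randn(1, 16, 32, 32)):size())      -- 1 x 32 x 30 x 30

Using nn.CAddTable() in place of nn.JoinTable(2) would sum the two branches instead of concatenating them.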


Jianxu




Touqeer Ahmad

Jun 21, 2016, 1:35:44 PM
to torch7
Thank you Jianxu for the prompt response.

If I understand it correctly, you are specifying the bottom and top layers explicitly, the same way Caffe deals with blobs and layers.
Can you really do that in Torch? Sorry, I am a newbie :)

The way I was defining my FCN is something like the following: first you define your architecture, then you train it using SGD etc. But I was worried that, whereas in Caffe the layer that is later fused is actually split (i.e., a copy of it is made), the way I am writing the architecture (below) just adds each layer to a single pipeline, which would not be the case in an actual FCN. To sum up, I want to confirm that in Torch you can specify what the input and the output are for each specific module, just like in Caffe.

net = nn.Sequential()
net:add(nn.SpatialFullConvolution(3, 16, 3, 3, 1, 1, 0, 0, 0, 0))
net:add(nn.ReLU())
net:add(nn.SpatialFullConvolution(16, 16, 3, 3, 1, 1, 1, 1, 0, 0))
net:add(nn.ReLU())
net:add(nn.SpatialMaxPooling(2,2,2,2))
net:add(nn.SpatialFullConvolution(16, 32, 3, 3, 1, 1, 1, 1, 0, 0))
net:add(nn.ReLU())
net:add(nn.SpatialFullConvolution(32, 32, 3, 3, 1, 1, 1, 1, 0, 0))
net:add(nn.ReLU())
net:add(nn.SpatialFullConvolution(32, 32, 3, 3, 1, 1, 1, 1, 0, 0))
net:add(nn.ReLU())
net:add(nn.SpatialMaxPooling(2,2,2,2))
net:add(nn.SpatialFullConvolution(32, 64, 3, 3, 1, 1, 1, 1, 0, 0))
net:add(nn.ReLU())
net:add(nn.SpatialFullConvolution(64, 64, 3, 3, 1, 1, 1, 1, 0, 0))
net:add(nn.ReLU())
net:add(nn.SpatialFullConvolution(64, 64, 3, 3, 1, 1, 1, 1, 0, 0))
net:add(nn.ReLU())
net:add(nn.SpatialMaxPooling(2,2,2,2))
net:add(nn.SpatialFullConvolution(64, 512, 5, 5, 1, 1, 0, 0, 0, 0))
net:add(nn.ReLU())
net:add(nn.SpatialFullConvolution(512, 1, 1, 1, 1, 1, 0, 0, 0, 0))
net:add(nn.ReLU())
net:add(nn.SpatialFullConvolution(1, 1, 2, 2, 2, 2, 0, 0, 0, 0))


Thanks,
Touqeer

Jianxu Chen

Jun 21, 2016, 5:10:40 PM
to torch7 on behalf of Touqeer Ahmad
Your implementation is not the nngraph version.

You can check out this: https://github.com/torch/nngraph

By using nngraph, you can explicitly assign the input and output of each layer.

Jianxu



Touqeer Ahmad

Jun 22, 2016, 3:12:52 PM
to torch7
Hi Jianxu,

Thanks much for your guidance.

I have redefined the CIFAR-10 example using nngraph; can you please have a quick look at the following and let me know if I am doing anything wrong?
I want to be sure on a familiar small-scale net before I move to FCN. The CIFAR-10 data is already loaded in the trainset tensor.

require 'nn';
require 'nngraph';

input = nn.Identity()()
L1 = nn.ReLU()(nn.SpatialConvolution(3,6,5,5)(input))
L2 = nn.SpatialMaxPooling(2,2,2,2)(L1)
L3 = nn.ReLU()(nn.SpatialConvolution(6,16,5,5)(L2))
L4 = nn.View(16*5*5)(nn.SpatialMaxPooling(2,2,2,2)(L3))
L5 = nn.ReLU()(nn.Linear(16*5*5,120)(L4))
L6 = nn.ReLU()(nn.Linear(120,84)(L5))
L7 = nn.ReLU()(nn.Linear(84,10)(L6))
L8 = nn.LogSoftMax()(L7)
net = nn.gModule({input},{L8})

criterion = nn.ClassNLLCriterion()

trainer = nn.StochasticGradient(net, criterion)
trainer.learningRate = 0.001
trainer.maxIteration = 5 -- just do 5 epochs of training.

trainer:train(trainset)


Best Regard,
Touqeer

Jianxu Chen

unread,
Jun 22, 2016, 3:29:06 PM6/22/16
to torch7 on behalf of Touqeer Ahmad
It looks good. But it would be better to run it by feeding in some random input, to check that it really works.
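For example (assuming the net variable from your post above):

-- a single 3x32x32 CIFAR sample, since nn.StochasticGradient feeds one example at a time
y = net:forward(torch.randn(3, 32, 32))
print(y:size())  -- should print 10, i.e. one log-probability per class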

Jianxu

Touqeer Ahmad

unread,
Jun 22, 2016, 7:36:12 PM6/22/16
to torch7
Hi Jianxu, thank you for your continuous help! I tried 1 epoch with CIFAR and it worked.

I have now defined my small-scale FCN and am trying to train it. However, I am getting this error:
"
mini-batch supported only
stack traceback:
"

Below is my network and how I am setting it up. Can you please point me in the right direction? Thanks!

-- data and labels:
print(trainset.data:size())  -- size is 10x3x64x64 i.e. 10 examples of 64x64 RGB image patches
print(trainset.label:size()) -- 10x64x64 i.e. class labels for each example, for me only two classes


-- the network architecture using nngraph
input = nn.Identity()()
L1 = nn.ReLU()(nn.SpatialFullConvolution(3,16,3,3,1,1,0,0,0,0)(input))
L2 = nn.ReLU()(nn.SpatialFullConvolution(16,16,3,3,1,1,0,0,0,0)(L1))
L3 = nn.SpatialMaxPooling(2,2,2,2)(L2)
L4 = nn.ReLU()(nn.SpatialFullConvolution(16,32,3,3,1,1,0,0,0,0)(L3))
L5 = nn.ReLU()(nn.SpatialFullConvolution(32,32,3,3,1,1,0,0,0,0)(L4))
L6 = nn.ReLU()(nn.SpatialFullConvolution(32,32,3,3,1,1,0,0,0,0)(L5))
L7 = nn.SpatialMaxPooling(2,2,2,2)(L6)
L8 = nn.ReLU()(nn.SpatialFullConvolution(32,64,3,3,1,1,0,0,0,0)(L7))
L9 = nn.ReLU()(nn.SpatialFullConvolution(64,64,3,3,1,1,0,0,0,0)(L8))
L10 = nn.ReLU()(nn.SpatialFullConvolution(64,64,3,3,1,1,0,0,0,0)(L9))
L11 = nn.SpatialMaxPooling(2,2,2,2)(L10)
L12 = nn.ReLU()(nn.SpatialFullConvolution(64,512,5,5,1,1,0,0,0,0)(L11))
L13 = nn.ReLU()(nn.SpatialFullConvolution(512,1,1,1,1,1,0,0,0,0)(L12))
L14 = nn.ReLU()(nn.SpatialFullConvolution(1,1,2,2,2,2,0,0,0,0)(L13))
L14c1 = nn.Narrow(2,1,20)(L14)
L14c2 = nn.Narrow(3,1,20)(L14c1)
PS1 = nn.ReLU()(nn.SpatialFullConvolution(32,1,1,1,1,1,0,0,0,0)(L7))
Fused1 = nn.CAddTable()({L14c2,PS1})
UFused1 = nn.ReLU()(nn.SpatialFullConvolution(1,1,2,2,2,2,0,0,0,0)(Fused1))
UFused1c1 = nn.Narrow(2,1,34)(UFused1)
UFused1c2 = nn.Narrow(3,1,34)(UFused1c1)
PS2 = nn.ReLU()(nn.SpatialFullConvolution(16,1,1,1,1,1,0,0,0,0)(L3))
Fused2 = nn.CAddTable()({UFused1c2,PS2})
UFused2 = nn.ReLU()(nn.SpatialFullConvolution(1,1,2,2,2,2,0,0,0,0)(Fused2))
UFused2c1 = nn.Narrow(2,1,64)(UFused2)
UFused2c2 = nn.Narrow(3,1,64)(UFused2c1)
net = nn.gModule({input},{UFused2c2})

-- setting the criterion

criterion = cudnn.SpatialCrossEntropyCriterion()  -- note the (): construct an instance (needs require 'cudnn')
trainer = nn.StochasticGradient(net, criterion)
trainer.learningRate = 0.0001
trainer.maxIteration = 1 -- just do 1 epoch of training, i.e. pass over these 10 examples only once

-- start training
trainer:train(trainset)

Thanks and Regards,
Touqeer







Jianxu Chen

Jun 22, 2016, 10:39:31 PM
to torch7 on behalf of Touqeer Ahmad
That's true. SpatialCrossEntropyCriterion in cudnn only supports mini-batches. To work around it, you can just add a singleton dimension to your input so that it is treated as batch size = 1.
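A sketch of what I mean (output and label are hypothetical names for one 3D prediction and its 2D target):

output4d = nn.utils.addSingletonDimension(output, 1)  -- c x h x w  ->  1 x c x h x w
label3d  = nn.utils.addSingletonDimension(label, 1)   -- h x w      ->  1 x h x w
loss = criterion:forward(output4d, label3d)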

Jianxu Chen

Touqeer Ahmad

Jun 23, 2016, 3:28:15 PM
to torch7
Thanks Jianxu,

But my data tensor and label tensor are already 4D and 3D, respectively. This is how I declared my tensors before reading in the data and label images, so this should be a batch of 10.

imagesAll = torch.Tensor(10,3,64,64)
labelsAll = torch.Tensor(10,64,64)

Regards,
Touqeer






Jianxu Chen

Jun 23, 2016, 3:41:42 PM
to torch7 on behalf of Touqeer Ahmad
Currently I cannot access Torch on my machine, so I am not able to test it for you. But I would suggest running the model without cudnn to see if there is any issue in the model itself.

Jianxu


zzz

Jun 24, 2016, 11:04:19 AM
to torch7
Hi,

How do you handle dynamic cropping for input images of different sizes?

Tristan Postadjian

Aug 24, 2016, 11:25:54 AM
to torch7
Hi,

Sorry for exhuming this thread, but I am very interested in your topic, since at test time I need to deal with inputs of a size different from the training samples.
I am working with aerial images (RGB and infrared) that are huge (about 7000x8000 at test time) compared to the training samples (64x64 or 128x128).
I understand almost all of the algorithm, but still have some questions:
1) What is the cropping step for?
2) Is it necessary to have a 3D label tensor for training? I am using a 1D tensor (one label for each sample) for the non-FCN version.
3) I computed the output size through Touqeer's net (built with nngraph), assuming the 3x64x64 images from CIFAR. I don't understand how he can make it work, since there are too many layers for any information to be left: I computed an output size of about 3x3 at step L11, the 3rd max pooling.
4) Shouldn't the L1-L10 conv layers be built with nn.SpatialConvolution objects, which perform convolutions, rather than nn.SpatialFullConvolution, which performs deconvolutions?

Jordan Campbell

Aug 31, 2016, 5:41:05 PM
to torch7
Hi,

I was wondering why the first parameters to Narrow were '3' and '4' in the code snippet posted earlier? This parameter is essentially the channel that we are indexing into, right? Why then '3' and '4'?

Could someone also possibly explain why we need the 'cropping' and the 'narrowing'? I understand that we have downsampled when we do the max pooling, which we then need to upsample with a SpatialFullConvolution, but I don't see how the cropping fits in.

Thank you,

Jordan

Jianxu Chen

Aug 31, 2016, 6:28:48 PM
to torch7 on behalf of Jordan Campbell
(1) Dimensions 3 and 4 are the actual height and width dimensions (the tensors are batch x channels x height x width).

(2) Cropping is needed when no padding is applied in the convolutions: the feature maps shrink at each layer, so the two branches must be cropped to matching sizes before they can be fused.
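A quick way to check the convention on a b x c x h x w tensor:

t = torch.randn(1, 16, 32, 32)
print(nn.Narrow(3, 5, 24):forward(t):size())  -- 1 x 16 x 24 x 32 (height cropped)
print(nn.Narrow(4, 5, 24):forward(t):size())  -- 1 x 16 x 32 x 24 (width cropped)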


Zhao Kai

Oct 12, 2016, 7:19:20 AM
to torch7
Hi Tushar,

Did you ever solve the Crop-layer problem?

I'm now also trying to implement a naive single-pass FCN, but I'm having trouble with the Crop layer. Can you give some help?

swamiviv

May 18, 2017, 12:14:45 AM
to torch7
I am looking for a Torch-based semantic segmentation implementation. Was this ever solved?


Jianxu Chen

May 18, 2017, 9:31:14 PM
to torch7