Re: bad argument #2 to '?'


soumith

Aug 21, 2015, 11:03:45 AM
to torch7 on behalf of ignorant
hmm, interesting. i'll try it with a fake tensor of the dataset of your size and see what happens.
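
Something along these lines is what I have in mind (just a sketch, untested; the shapes are taken from your post, and the 1..2 label range is my assumption since ClassNLLCriterion wants targets in 1..nClasses):

require 'torch'
-- fake dataset with the same shapes as yours: 1336 samples of 3x32x32, 2 classes
trainData = {
    data  = torch.DoubleTensor(1336, 3, 32, 32):uniform(),
    label = torch.ByteTensor(1336):random(1, 2),
}
setmetatable(trainData,
    {__index = function(t, i) return {t.data[i], t.label[i]} end})
function trainData:size() return self.data:size(1) end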

On Fri, Aug 21, 2015 at 1:04 AM, ignorant via torch7 <torch7+APn2wQfp_liogVOVBN8445pZH...@googlegroups.com> wrote:
I defined __index as in the tutorial -
setmetatable(trainData,
    {__index = function(t, i)
                    return {t.data[i], t.label[i]}
                end}
);

function trainData:size() 
    return self.data:size(1) 
end
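
Adding for context: as far as I understand, ClassNLLCriterion expects integer targets in 1..nClasses, so a quick check on the labels along these lines (just a sketch) might be worth running:

print(trainData.label:min(), trainData.label:max()) -- for 2 classes this should print 1 and 2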



On Friday, August 21, 2015 at 1:00:25 AM UTC-4, ignorant wrote:
Hi,

I'm new to Torch. I ran through this excellent tutorial https://github.com/soumith/cvpr2015 and tried to apply it to my own data.

My data is 

{
  data : DoubleTensor - size: 1336x3x32x32
  size : function: 0x403620f0
  label : ByteTensor - size: 1336
}

I am starting with only two output classes at the moment (there will be more). I know this is very little training data; once I get this working, I will generate more by applying image transformations, roughly as sketched below.
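
Roughly what I have in mind for the transformations (a sketch only, untested; augData and augLabels are names I made up):

require 'image'
-- double the dataset with horizontal flips
flipped = torch.DoubleTensor(trainData.data:size())
for i = 1, trainData.data:size(1) do
    flipped[i] = image.hflip(trainData.data[i])
end
augData   = torch.cat(trainData.data, flipped, 1)
augLabels = torch.cat(trainData.label, trainData.label, 1)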

I added the __index and size functions and can access the data by calling itorch.image(trainData[10][1]). I took the net from the tutorial and modified it slightly to fit my data -

net = nn.Sequential()
net:add(nn.SpatialConvolution(3, 6, 5, 5)) -- 3 input image channel, 6 output channels, 5x5 convolution kernel
net:add(nn.SpatialMaxPooling(2,2,2,2))     -- A max-pooling operation that looks at 2x2 windows and finds the max.
net:add(nn.SpatialConvolution(6, 16, 5, 5))
net:add(nn.SpatialMaxPooling(2,2,2,2))
net:add(nn.View(16*5*5))                    -- reshapes from a 3D tensor of 16x5x5 into 1D tensor of 16*5*5
net:add(nn.Linear(16*5*5, 120))             -- fully connected layer (matrix multiplication between input and weights)
net:add(nn.Linear(120, 84))
net:add(nn.Linear(84, 2))                   -- 2 is the number of outputs of the network (the 2 classes in my data)
net:add(nn.LogSoftMax())                     -- converts the output to a log-probability. Useful for classification problems
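
As a quick shape sanity check (just a sketch of what I mean), forwarding a single sample should produce 2 log-probabilities, which would confirm that nn.View(16*5*5) matches a 3x32x32 input:

sample = trainData[1][1]   -- one 3x32x32 image
print(net:forward(sample)) -- expect a 1D tensor of size 2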

When I try to run trainer:train(trainData), I get the following -
# StochasticGradient: training	
bad argument #2 to '?' (out of range at /home/juliette/torch/pkg/torch/generic/Tensor.c:853)
stack traceback:
	[C]: at 0x7f1d072cb700
	[C]: in function '__index'
	...tte/torch/install/share/lua/5.1/nn/ClassNLLCriterion.lua:58: in function 'forward'
	...te/torch/install/share/lua/5.1/nn/StochasticGradient.lua:35: in function 'f'
	[string "local f = function() return -- do the trainin..."]:2: in main chunk
	[C]: in function 'xpcall'
	/home/juliette/torch/install/share/lua/5.1/itorch/main.lua:177: in function </home/juliette/torch/install/share/lua/5.1/itorch/main.lua:143>
	/home/juliette/torch/install/share/lua/5.1/lzmq/poller.lua:75: in function 'poll'
	.../juliette/torch/install/share/lua/5.1/lzmq/impl/loop.lua:307: in function 'poll'
	.../juliette/torch/install/share/lua/5.1/lzmq/impl/loop.lua:325: in function 'sleep_ex'
	.../juliette/torch/install/share/lua/5.1/lzmq/impl/loop.lua:370: in function 'start'
	/home/juliette/torch/install/share/lua/5.1/itorch/main.lua:344: in main chunk
	[C]: in function 'require'
	[string "arg={'/home/juliette/.ipython/profile_torch/s..."]:1: in main chunk


I have been staring at this for two days and can't figure out why the demo works but mine fails. I'm sure it is something trivial, but I can't see it.

I would appreciate any pointers on debugging this.
Thanks,


Lovekesh Vig

Sep 15, 2015, 2:11:58 PM
to torch7
Did you have any luck solving this problem? I'm getting the same error.

Thanks,

Lovekesh



alban desmaison

Sep 16, 2015, 3:59:20 AM
to torch7
Did you try updating your torch and nn packages?

Otherwise, can you give the exact error message you get? The one above points to old code that no longer exists in the current version of torch/nn.

Muge Ersoy

Dec 4, 2015, 6:37:55 AM
to torch7
I am getting the same error with my prediction script:

/home/ubuntu/torch/install/bin/luajit: bad argument #2 to '?' (end index out of bound)
stack traceback:
	[C]: at 0x7f6f48e36160
	[C]: in function '__index'
	...u/torch/install/share/lua/5.1/cunn/DataParallelTable.lua:527: in function '_distributeTensorRecursive'
	...u/torch/install/share/lua/5.1/cunn/DataParallelTable.lua:282: in function 'updateOutput'
	/home/ubuntu/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	guess.lua:35: in main chunk
	[C]: in function 'dofile'
	...untu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
	[C]: at 0x00406670

Here is my script:
require 'torch'
require 'cutorch'
require 'nn'
require 'dpnn'
require 'cunn'
require 'cudnn'
require 'dp'
require 'image'


local loadSize   = {3, 256, 256}
local sampleSize = {3, 224, 224}

im = image.load('/path/to/image/5483202298.jpg'):type('torch.FloatTensor'):contiguous()
im = im * 255

-- center-crop to the sample size (224x224)
oH = sampleSize[2]
oW = sampleSize[3]
iW = im:size(3)
iH = im:size(2)
w1 = math.ceil((iW - oW) / 2)
h1 = math.ceil((iH - oH) / 2)
out = image.crop(im, w1, h1, w1 + oW, h1 + oH)

-- per-channel mean/std normalization (a no-op here, since mean and std are never set)
for i = 1, 3 do -- channels
    if mean then out[{ {i}, {}, {} }]:add(-mean[i]) end
    if std then out[{ {i}, {}, {} }]:div(std[i]) end
end

-- add a batch dimension before the forward pass
if out:dim() == 3 then
    img = out:view(1, out:size(1), out:size(2), out:size(3))
end

model = 'model_4.t7'
m = torch.load(model)
predictions = m:forward(img:cuda())
print(predictions)


alban desmaison

Dec 4, 2015, 8:18:38 AM
to torch7
First, your version of torch is not up to date with the latest; you may want to update it.

Second, where does your model come from? You have a DataParallelTable in it; why? You can use it to speed up training on multiple GPUs, but why keep it for testing? If you have a DataParallelTable working with 2 GPUs, for example, it expects the number of elements in the input batch to be a multiple of 2. Here you give it only 1.
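
If you only need single-image inference, something along these lines should work around it by pulling the plain network out of the DataParallelTable (untested sketch; it assumes the wrapped network is the first child module):

m = torch.load('model_4.t7')
if torch.type(m) == 'nn.DataParallelTable' then
    m = m:get(1)  -- unwrap the underlying network for single-GPU inference
end
m:evaluate()
predictions = m:forward(img:cuda())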

swapnee...@djsce.edu.in

Dec 9, 2016, 5:50:18 AM
to torch7
For future reference on this issue: as far as I know, it is caused by the training set not being large enough. See this link: https://github.com/jcjohnson/torch-rnn/issues/135

