SparseLinear nn to handle large inputs and large outputs for a prediction problem


Solène Buet

Jun 23, 2016, 6:35:41 AM
to torch7
Dear all,

I'm pretty new to the magic of torch7 and seek your help/advice for a problem of mine.

Context: I am working on a prediction problem. We observed a certain pattern in our values and would like to predict the next values we will receive over time.
We have a reference vector of almost 600K unique values, and our dataset consists of vectors containing subsets of these values. Say that over 120 seconds we receive 50K of these values: that is one vector of our dataset (we have many other vectors). The idea is that if we observe, say, the first 60 seconds and get 30K of these values, we would like to predict the next ones, which we know will belong to the reference vector (the set of all possible values/categories).

So basically working with large inputs, and expecting large outputs.
My current idea and implementation is using SparseLinear with an input size of 600K.

model = nn.Sequential()
model:add(nn.SparseLinear(inputSize, 1000))  -- inputSize = 600K (size of the reference vector)
model:add(nn.SoftMax())
criterion = nn.MSECriterion()
trainer = nn.StochasticGradient(model, criterion)
trainer.learningRate = 0.01
trainer.maxIteration = 300

First, I take my 50K vector and transform it into a 600K binary vector, where an entry is set to 1 if the corresponding reference value was observed and 0 otherwise:
so for that example I should get 50K ones and 550K zeros.
For now, I take 10'000x1 slices of that vector that I "sparsify", which gives me tensors of unpredictable size.
As labels, I take 12'000x1 slices: the input plus the next 2'000 values. I do this over the whole 50K->600K sample vector, which gives me a few dozen training examples.
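(For anyone following along: nn.SparseLinear in torch7 expects each input as an Nx2 tensor of {index, value} pairs with 1-based indices, not a dense vector. The "sparsify" step above can be sketched as follows; this is plain Python standing in for the equivalent torch.Tensor construction, just to show the format.)

```python
# Sketch of the "sparsify" step: turn a dense 0/1 slice into the
# {index, value} pair format that nn.SparseLinear expects.
# Torch uses 1-based indices, hence the i + 1 below.

def sparsify(dense_slice):
    """Return a list of [index, value] pairs for the nonzero entries."""
    return [[i + 1, v] for i, v in enumerate(dense_slice) if v != 0]

dense = [0, 1, 0, 0, 1, 1]
print(sparsify(dense))  # -> [[2, 1], [5, 1], [6, 1]]
```

Note that the result has one row per observed value, which is why the sparsified tensors vary in size from slice to slice.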

Here are my questions:
1. Does the logic make sense to anyone? I'm not 100% confident that this is the best way to approach the problem.
2. If I want several layers in my NN, should I still use SparseLinear? 
3. What is the best practice in terms of number of hidden neurons and number of layers?
4. Do I have to normalize my data in any way, knowing I'm dealing only with 0s and 1s?
5. How can I get an output layer that indicates the indexes of the predicted values (the ones turned to 1) relative to my reference vector?
6. How can I create a training set and use the StochasticGradient train function: trainer:train(trainset)?
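(Regarding question 5, one approach worth considering, my own suggestion rather than anything from the model above, is to threshold the output scores, or take the top-k, and report the surviving indices. In plain Python, standing in for torch.gt()/torch.sort() calls on the output tensor:)

```python
# Illustration for question 5: recover the 1-based indices of the
# predicted values from a score vector by thresholding.
# The threshold value is an assumption to be tuned on validation data.

def predicted_indices(scores, threshold):
    """Return the 1-based indices whose score exceeds the threshold."""
    return [i + 1 for i, s in enumerate(scores) if s > threshold]

scores = [0.1, 0.8, 0.05, 0.6]
print(predicted_indices(scores, 0.5))  # -> [2, 4]
```

Mapping these indices back onto the reference vector then gives the set of predicted values.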

I know there is A LOT going on here, but I haven't managed to find help anywhere else, so this torch7 group is a bit of my last hope.
If something is unclear or my explanation is lacking, don't hesitate to ask.

Thanks a lot in advance



Solène Buet

Jun 27, 2016, 5:57:20 AM
to torch7
Any ideas, anyone? Thanks in advance.