require 'nn'

-- inputSize = dimensionality of the sparse input vector
model = nn.Sequential()
model:add(nn.SparseLinear(inputSize, 1000))
model:add(nn.SoftMax())

criterion = nn.MSECriterion()
trainer = nn.StochasticGradient(model, criterion)
trainer.learningRate = 0.01
trainer.maxIteration = 300
First, I take my 50K vector and transform it into a 600K vector, where an entry is 1 when the corresponding value exists in the reference vector and 0 otherwise: so for this example I should get 50K ones and 550K zeros.
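To make that step concrete, here is roughly how I do the binarization (a sketch: `indexOf` is a precomputed Lua table mapping each possible value to its position in the 600K reference vector, and `refSize` is 600K — those names are mine, not from any library):

    require 'torch'

    -- sample: 1-D tensor of 50K values
    -- indexOf: value -> position in the reference vector
    -- refSize: length of the reference vector (600K)
    local function binarize(sample, indexOf, refSize)
      local binary = torch.zeros(refSize)
      for i = 1, sample:size(1) do
        local idx = indexOf[sample[i]]
        if idx then binary[idx] = 1 end
      end
      return binary
    end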
For now, I'm taking 10,000x1 tensors from that vector and "sparsifying" them, which gives me tensors of unpredictable size.
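By "sparsify" I mean converting a dense 0/1 chunk into the n x 2 tensor of {index, value} rows that, as far as I understand, nn.SparseLinear expects as input (only the nonzero entries are kept; the helper name is my own):

    -- dense: 1-D 0/1 tensor (e.g. a 10,000x1 chunk)
    -- returns an n x 2 tensor: column 1 = index, column 2 = value
    local function sparsify(dense)
      local nz = {}
      for i = 1, dense:size(1) do
        if dense[i] ~= 0 then
          table.insert(nz, {i, dense[i]})
        end
      end
      return torch.Tensor(nz)
    end

Since the number of ones varies per chunk, n varies too, which is why the resulting tensors have unpredictable sizes.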
As labels, I take 12,000x1 tensors: the input plus the next 2,000 values. I do that over the whole 50K->600K sample vector, which gives me a few dozen training examples.
Here are my questions:
1. Does the logic make sense to anyone? I'm not 100% confident that this is the best way to approach the problem.
2. If I want several layers in my NN, should I still use SparseLinear?
3. What is the best practice in terms of number of hidden neurons and number of layers?
4. Do I have to normalize my data in any way, given that I'm dealing only with 0s and 1s?
5. How can I get an output layer that indicates the indices of the predicted values that will exist (i.e. be turned to 1) compared to my reference vector?
6. How can I create a training set and use the StochasticGradient train function, trainer:train(trainset)?
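For question 6, this is the dataset format I believe trainer:train expects, based on the examples I've seen: a table indexed by example number, where each entry is an {input, target} pair, plus a size() method (this is my assumption, so please correct me if the format is wrong):

    -- inputs[i]: sparsified input tensor, targets[i]: its 12,000x1 label
    local trainset = {}
    for i = 1, #inputs do
      trainset[i] = {inputs[i], targets[i]}
    end
    function trainset:size()
      return #inputs
    end

    trainer:train(trainset)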
I know there is A LOT going on here, but I can't easily find help anywhere else, so Torch7 is a bit of my last hope.
If something is unclear or my explanation is lacking, don't hesitate to ask.
Thanks a lot in advance