Train size too big / Hidden Units


Carson Boyd

Feb 9, 2014, 9:04:34 PM
to ebl...@googlegroups.com
Hi,

I have recently been training various MNIST architectures, and three things have come to my attention.

1.

The first is that when I want to train on the full MNIST dataset of 60,000 samples, the configuration reports the exact size, but the first iteration brings up a larger number to be trained, such as 67420.
I have been using the sample MNIST provided to test this.

I have used the following lines to set the training size to 60,000, but neither keeps the training size at exactly that amount:

epoch_size = 60000
train_size = 60000

I have also tried 50,000 samples, and it gives me the same sort of result, such as 56780.

I am not sure if I am doing something wrong, but these are the only parameters I can see that change the training size/epoch length.

That said, would this affect my network, and is the extra size/length due to other things going on in the network?

2.

Because of this extra number of training samples, when I set a diagonal-Hessian (diaghessian) period of 60000, the estimation happens in the middle of an epoch rather than at the end, where I want it.

hessian_period = 60000

3.

Lastly, I have been playing around with various architectures, but I am still not sure about hidden units, as in classifier_hidden.

In the sample configuration, classifier_hidden is 16.

Does that mean all layers have 16 hidden units, or only the layers indicated with that variable?
I would assume each layer has its own input/output sizes based on the architecture.

I would like to have different numbers of hidden units throughout the layers.

....


Thank you; hopefully these are proper questions (not already asked). I did try to look through the forums for answers first.
Carson.

soumith

Feb 9, 2014, 9:16:56 PM
to ebl...@googlegroups.com
the configuration reports the exact size, but the first iteration brings up a larger number to be trained, such as 67420.
By default, training is balanced (an equal number of samples is shown per class), so that epoch number is basically [max number of samples in any class] * nClasses.
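To make that arithmetic concrete: in the standard MNIST training split, the largest class is digit '1', with 6742 samples, so the balanced epoch works out to 6742 samples/class * 10 classes = 67420, which is exactly the number you are seeing.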

2. Set hessian_period to 67420 as a work-around.
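In conf terms, reusing the hessian_period parameter from your message (67420 being the balanced epoch size from above):

hessian_period = 67420   # align the diagonal-Hessian estimation with the balanced epoch boundary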

3. Does that mean all layers have 16 hidden units, or only the layers indicated with that variable?
Only the layers indicated with that variable.
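If you want different hidden sizes per layer, one sketch (assuming the usual eblearn conf ${...} variable substitution; the hidden0/hidden1 names are hypothetical placeholders, not parameters from the sample conf):

hidden0 = 32             # hypothetical: referenced as ${hidden0} by one layer in your arch
hidden1 = 64             # hypothetical: referenced as ${hidden1} by another layer
classifier_hidden = 16   # still applies only to the layers that reference it

That is, define one variable per layer and reference each from the corresponding entry in your architecture definition, instead of reusing ${classifier_hidden} everywhere.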



Carson Boyd

Feb 9, 2014, 11:33:10 PM
to ebl...@googlegroups.com
Thank you,

Helpful and quick,

Carson