what the parameters are and what they do in Solver Options

Caroline

Oct 1, 2019, 1:16:29 PM

to DIGITS Users

I'd appreciate as much detail as possible about what the parameters in Solver Options are and what they do (Training epochs, Snapshot interval, Validation interval, Random seed, Batch size, Blob format, Solver type, Base learning rate) when training a new image model. (especially those in bold)

Thanks!

l_mb

Oct 1, 2019, 10:58:21 PM

to DIGITS Users

Random seed - since computers generate random numbers somewhat non-randomly depending on where they start, this lets you choose the starting point so you can get repeatability in your randomness (e.g. weight initialization).
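To make the seed idea concrete, here is a minimal sketch in plain Python. It is not what DIGITS runs internally; `init_weights` is a made-up helper that just draws a few small random "weights" from a seeded generator.

```python
import random

def init_weights(n, seed):
    # Hypothetical helper: a generator seeded with a fixed value
    # always produces the same sequence of "random" numbers.
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.01) for _ in range(n)]

a = init_weights(5, seed=42)
b = init_weights(5, seed=42)
c = init_weights(5, seed=7)

print(a == b)  # same seed: identical starting weights
print(a == c)  # different seed: different starting weights
```

With the same seed, two runs start from the same initialization, which makes it easier to tell whether a change in results came from your settings or just from luck.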

Batch size - you can take each optimization step based on one example, all the examples, or a few of the examples. Usually a few examples give you a good compromise between not stepping recklessly in the wrong direction and not taking forever. This number is how many you want "a few" to be for your problem.
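As a rough illustration of the three cases (one example, a few, all of them), the sketch below splits a toy dataset into batches. `minibatches` is a hypothetical helper for this post, not a DIGITS function.

```python
def minibatches(examples, batch_size):
    # Yield consecutive slices of at most batch_size examples.
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]

examples = list(range(10))  # stand-in for 10 training images

batches = list(minibatches(examples, batch_size=4))
one_at_a_time = list(minibatches(examples, batch_size=1))
all_at_once = list(minibatches(examples, batch_size=len(examples)))

# One optimization step is taken per batch, so the batch size
# sets how many steps make up one pass over the data.
print(len(one_at_a_time), len(batches), len(all_at_once))
```

Here a batch size of 4 gives three steps per pass over the data, with the last batch holding the two leftover examples.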

Solver type - which optimization algorithm are you using to find a solution? There are a ton of resources for these online, including the pros and cons of each one. Try looking them up by name.

Base learning rate - this is the multiplier for the gradient of your loss function. It scales the size of the step you take toward the optimal solution. It can be adjusted over your epochs, using your solver options, to slowly narrow in on your ideal solution. (The solver options depend on the solver type, which is another reason to take some time reading up on them.)
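Here is a small sketch of both ideas: the learning rate as a multiplier on the gradient, and a schedule that shrinks it over time. The loss `(w - 3)^2` and the numbers are made up for illustration, and `step_lr` follows the step-down formula `base_lr * gamma ^ floor(iteration / step_size)` as I understand Caffe's `step` policy; treat it as an assumption and check your solver's documentation.

```python
def step_lr(base_lr, gamma, step_size, iteration):
    # Step-down schedule: multiply the rate by gamma every step_size iterations.
    return base_lr * gamma ** (iteration // step_size)

def grad(w):
    # Gradient of the toy loss (w - 3)^2, whose minimum is at w = 3.
    return 2.0 * (w - 3.0)

w = 0.0
for it in range(60):
    lr = step_lr(base_lr=0.1, gamma=0.5, step_size=20, iteration=it)
    w -= lr * grad(w)  # step size = learning rate * gradient

print(w)  # ends up close to 3
```

Large early steps cover ground quickly, and the smaller later steps let the weight settle near the minimum instead of bouncing around it.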

Epochs - how long do you want to train? One epoch is one full pass over your training set. Try a big-ish number, then increase it if your model needs more time or decrease it if it converges early.

Snapshot interval - how often do you want to save your progress? A snapshot is a file containing the model parameters at a certain epoch. If you have a big model, save sparingly - it'll eat up your hard drive

Validation interval - how often do you want to see how you're doing? This is how often DIGITS applies your model to your validation set to see how it's generalizing to unseen data.
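To show how epochs, snapshot interval, and validation interval fit together, here is a toy schedule. It assumes both intervals are whole numbers of epochs for simplicity, and `training_schedule` is a hypothetical helper that only records what would happen when; it does not train anything.

```python
def training_schedule(epochs, snapshot_interval, validation_interval):
    # Record the order of events across a training run.
    events = []
    for epoch in range(1, epochs + 1):
        events.append(("train", epoch))
        if epoch % validation_interval == 0:
            events.append(("validate", epoch))  # check the validation set
        if epoch % snapshot_interval == 0:
            events.append(("snapshot", epoch))  # save model parameters to disk
    return events

events = training_schedule(epochs=6, snapshot_interval=3, validation_interval=2)
validated = [e for kind, e in events if kind == "validate"]
snapshots = [e for kind, e in events if kind == "snapshot"]

print(validated)
print(snapshots)
```

With 6 epochs, validating every 2 and snapshotting every 3, you get three validation checks and only two files on disk.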

This is just a quick summary. Most of these have a lot of nuance, so definitely take the time to read up on them, where to start, and how to adjust based on your problem.

Happy learning!

Caroline

Oct 2, 2019, 10:11:04 AM

to DIGITS Users

Thanks, I forgot to ask about Batch Accumulation. I'm not finding anything about it :(

l_mb

Oct 3, 2019, 11:58:25 AM

to DIGITS Users

Batch accumulation literally accumulates gradients from batches. It's helpful when your data are very large and you can't fit a full batch on your GPU. You can split the batch in half or in thirds and accumulate the gradients over 2 or 3 batch calculations before updating the weights.
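A small sketch of why this works, using a made-up one-weight model `y ≈ w * x` with mean squared error. The data and the `mean_grad` helper are invented for illustration, and the sketch assumes the smaller batches are all the same size. Averaging the gradients of two half batches gives the same update as one full batch.

```python
def mean_grad(w, batch):
    # Gradient of the mean squared error for the toy model y ≈ w * x.
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w, lr = 0.5, 0.01

# What you would do if the full batch fit in memory.
full = mean_grad(w, data)

# Batch accumulation: two half-size batches, one weight update.
accum_steps = 2
micro = len(data) // accum_steps
acc = 0.0
for i in range(accum_steps):
    chunk = data[i * micro:(i + 1) * micro]
    acc += mean_grad(w, chunk) / accum_steps  # accumulate, no update yet

w_accum = w - lr * acc   # single update after accumulating
w_full = w - lr * full

print(abs(w_accum - w_full) < 1e-12)
```

So the effective batch size is roughly the batch size times the number of accumulation steps, at the cost of more forward and backward passes per update.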
