From your answer I understand that the parameter is related to some initial memory allocation.
I was trying to ask about something else. Let me first clarify my premise and see if it makes sense
in the Kaldi context.
I am used to training models in TensorFlow and PyTorch.
When training on a GPU I usually see that the time per training step is roughly the same
whether I use 20% or 95% of the GPU memory, so I would usually increase my batch size until
I am close to full utilization of the GPU memory to maximize my throughput (I usually update
the parameters once every few mini-batches, i.e. gradient accumulation).
Does this make sense also in the Kaldi framework?
If so, my question is about how to achieve this.
(I am running Kaldi training with the standard LibriSpeech chain recipe,
kaldi/egs/librispeech/s5/local/chain/tuning/run_tdnn_1b.sh.
During training I observe a constant usage of about 50% of the GPU memory; I never see
the training process grabbing more GPU memory along the way.)
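For what it's worth, I have been watching the memory with something like this polling sketch
(assuming nvidia-smi is on the PATH and the training job runs on GPU index 0):

```python
import subprocess
import time

# Poll GPU 0 every 5 seconds while training runs; memory.used stays flat
# at roughly half of memory.total for the whole run.
while True:
    out = subprocess.check_output(
        ["nvidia-smi", "--id=0",
         "--query-gpu=memory.used,memory.total,utilization.gpu",
         "--format=csv,noheader"])
    print(out.decode().strip())
    time.sleep(5)
```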