TensorFlow Data API


Stephen O'Neill

Feb 18, 2022, 2:27:36 PM
to Discuss
Hey gang,

I'm looking for more detailed information on TensorFlow's mechanics for getting data from main memory into GPU VRAM: when does this happen, what code controls it, when is TensorFlow done with that memory segment, and so forth.
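For concreteness, here's roughly the kind of pipeline I'm working with (dummy arrays standing in for my real data). My understanding, which I'd be happy to have corrected, is that .prefetch() overlaps host-side batch preparation with the training step, and prefetch_to_device() additionally stages upcoming batches in GPU memory so the host-to-device copy overlaps compute:

import numpy as np
import tensorflow as tf

# Dummy stand-ins for my real training arrays.
images = np.random.rand(1024, 224, 224, 3).astype("float32")
labels = np.random.randint(10, size=1024)

ds = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .shuffle(1024)
    .batch(32)
    # Overlap host-side preparation of the next batch with training.
    .prefetch(tf.data.AUTOTUNE)
    # Stage batches in GPU memory ahead of time; per the docs this has
    # to be the final transformation in the pipeline.
    .apply(tf.data.experimental.prefetch_to_device("/gpu:0"))
)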

My ultimate goal, of course, is to speed up model training. I was thinking along the following track: suppose we had a few memory-mapped buffers that are overwritten in place after each batch is processed, instead of batches being dynamically allocated wherever the training dataset happens to put them; combined with some fancy kernel paging, that might speed things up. I assume that in the majority of training cases, once the first layer of a neural network has consumed the input batch, we could begin swapping it out while the remaining layers and backprop run (though I need to double-check the math: if the first layer computes y = W1*x, then backprop computes dL/dW1 = delta1 * x^T, so the pixel values would get touched again at the very end of the backward pass).
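To make the buffer-reuse idea concrete, here's a rough sketch of the kind of thing I mean, assuming a preprocessed file of float32 batches at a made-up path "train.dat". As far as I can tell, from_generator still copies each yielded view into a TF-owned tensor, so this alone probably doesn't buy much, which is part of why I'm asking:

import numpy as np
import tensorflow as tf

BATCH_SHAPE = (32, 224, 224, 3)
N_BATCHES = 1000

def batches_from_memmap():
    # One fixed mapping; the OS pages slices in as we walk through it,
    # instead of a fresh allocation per batch.
    data = np.memmap("train.dat", dtype=np.float32, mode="r",
                     shape=(N_BATCHES,) + BATCH_SHAPE)
    for i in range(N_BATCHES):
        # Yield a view into the mapping; I believe TF copies it into
        # its own tensor at this point anyway.
        yield data[i]

ds = tf.data.Dataset.from_generator(
    batches_from_memmap,
    output_signature=tf.TensorSpec(shape=BATCH_SHAPE, dtype=tf.float32),
).prefetch(tf.data.AUTOTUNE)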

I assume the RAM->VRAM transfers are controlled by the CUDA runtime itself, not by the Python interface layers at all (like keras.engine.data_utils.GeneratorEnqueuer or the DataAdapter classes). I was hoping someone here might be able to chime in on whether this idea has any merit and, if so, how it could be accomplished in TensorFlow, or perhaps share a more detailed abstraction of what TensorFlow is doing under the hood of its data API.
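The closest I've come to observing the transfers directly is the TensorFlow profiler; assuming a compiled Keras model and an input pipeline ds like the ones above, something like this records a trace where the host-to-device memcpy events show up on the GPU streams alongside the compute kernels:

import tensorflow as tf

tf.profiler.experimental.start("logs/profile")
model.fit(ds, epochs=1, steps_per_epoch=100)
tf.profiler.experimental.stop()
# Then: tensorboard --logdir logs/profile, and open the trace viewer.

But that only shows me when the copies happen, not which code decides where the source buffers live, which is what I'd like to influence.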

Thanks!
Steve O'Neill