How to optimally combine TFRecordDatasets with ImageDataGenerator


Tuelle

Aug 17, 2018, 1:10:25 PM
to Keras-users
Hi,
I want to do semantic segmentation on a 100 GB dataset of CT slices. I cannot load all of the data into memory, so I want to handle it as a TFRecordDataset. What is the most performant way to do online data augmentation with the Keras ImageDataGenerator? I am feeding a dataset iterator into the flow function, but so far it is still about 30% slower than when the data is stored in a NumPy array in memory. Since the online augmentation is threaded, I would not have expected it to become the bottleneck in the training pipeline. Any tips?
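For illustration, here is a minimal sketch of one alternative setup: the augmentation is done inside the tf.data pipeline itself with tf.image ops, instead of passing batches through ImageDataGenerator, so that parsing, augmentation and training can overlap via parallel map calls and prefetching. Everything specific here is an assumption and not taken from the setup described above: the feature keys "image" and "mask", the raw float32/uint8 encoding, the 512x512 slice size, the function names, and the single random flip as the only augmentation. It also assumes a TensorFlow 2.x release that provides tf.data.AUTOTUNE.

```python
import tensorflow as tf

# Assumed record layout (hypothetical): each example stores one raw float32
# CT slice and one raw uint8 label mask, both with a fixed, known size.
HEIGHT, WIDTH = 512, 512

FEATURES = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "mask": tf.io.FixedLenFeature([], tf.string),
}


def parse_example(serialized):
    """Decode one serialized tf.train.Example into (image, mask) tensors."""
    parsed = tf.io.parse_single_example(serialized, FEATURES)
    image = tf.io.decode_raw(parsed["image"], tf.float32)
    mask = tf.io.decode_raw(parsed["mask"], tf.uint8)
    image = tf.reshape(image, [HEIGHT, WIDTH, 1])
    mask = tf.reshape(mask, [HEIGHT, WIDTH, 1])
    return image, mask


def augment(image, mask):
    """Apply the same random horizontal flip to a slice and its mask."""
    flip = tf.random.uniform([]) < 0.5
    image = tf.cond(flip, lambda: tf.image.flip_left_right(image), lambda: image)
    mask = tf.cond(flip, lambda: tf.image.flip_left_right(mask), lambda: mask)
    return image, mask


def make_dataset(filenames, batch_size=8, shuffle_buffer=256, parallel_reads=4):
    """Build a pipeline that reads, parses, augments, batches and prefetches."""
    dataset = tf.data.TFRecordDataset(filenames, num_parallel_reads=parallel_reads)
    # Parse and augment on several threads; AUTOTUNE lets tf.data pick the count.
    dataset = dataset.map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    dataset = dataset.shuffle(shuffle_buffer)
    dataset = dataset.map(augment, num_parallel_calls=tf.data.AUTOTUNE)
    dataset = dataset.batch(batch_size)
    # Prefetch so the next batches are prepared while the current one trains.
    return dataset.prefetch(tf.data.AUTOTUNE)
```

With TensorFlow 2.x, the returned dataset can be passed directly to model.fit(...). Whether this closes the 30% gap depends on where the time is actually spent, so it is worth timing the input pipeline on its own before and after the change.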