Is fit_generator supposed to be slow?
I wrote a data generator so that batches can be fed to the model, but extracting and processing the data takes time
and makes everything buggy.
Setting pickle_safe=True seems to help a bit, but it is still very buggy. How is Keras supposed to handle large datasets?
def train_generator(batch_size):
    while True:
        for input in train_files:
            #print input
            output = input.split("_")
            output[-1] = "output.h5"
            output = "_".join(output)
            #print output
            h5f = h5py.File(numpy_train_input+'/'+input, 'r')
            train_input = h5f['train_input'][:]
            h5f.close()
            h5f = h5py.File(numpy_train_output+'/'+output, 'r')
            train_output = h5f['train_output'][:]
            h5f.close()
            train_input = train_input.reshape((batch_size,splits,total_frames_with_deltas,window_height,3))
            train_input_list = np.split(train_input,33,axis=1)
            for i in range(len(train_input_list)):
                train_input_list[i] = train_input_list[i].reshape(batch_size,45,8,3)
            #print train_input_list[0].shape
            #print train_output.shape
            yield (train_input_list, train_output)
1000/1000 [==============================] - 12166s - loss: 3.5786 - categorical_accuracy: 0.1081 - val_loss: 4.2263 - val_categorical_accuracy: 0.0500
You are reading from disk on every call to the generator, which incurs overhead, especially if your read pattern is random. I am not sure threading will buy you much here: I believe the h5py library behaves like the GIL, in that every call passes through a single lock, so you do not really get parallelism (you should try it, though, and hopefully prove me wrong). You could make sure your generator is multiprocess-safe, set pickle_safe to True, and see whether that gets around the h5py locking issue.
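If you want to check how much of the time goes into the generator itself, a rough sketch like the one below may help. It pulls a few batches from any generator and records how long each one took, without involving the model at all. It uses only the Python 3 standard library; `time_generator` and `fake_generator` are names made up for this example, and `fake_generator` is just a stand-in that sleeps instead of reading HDF5 files.

```python
import time


def time_generator(gen, n_batches=10):
    """Pull up to n_batches items from gen and return the seconds each took."""
    timings = []
    it = iter(gen)
    for _ in range(n_batches):
        start = time.perf_counter()
        try:
            next(it)
        except StopIteration:
            # The generator ran out before n_batches items were produced.
            break
        timings.append(time.perf_counter() - start)
    return timings


def fake_generator(delay=0.01):
    """Hypothetical stand-in for a real data generator: sleeps, then yields."""
    while True:
        time.sleep(delay)  # pretend this is the disk read and the reshape
        yield None


timings = time_generator(fake_generator(), n_batches=5)
print("seconds per batch:", ["%.4f" % t for t in timings])
```

With your own code you would pass `train_generator(batch_size)` instead of `fake_generator()`. If the per-batch time is close to the per-batch time you see during training, the data loading is the likely bottleneck rather than the model.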
Also, the first epoch is always slow because your model is compiling. I am not sure exactly what model.compile() does, but in the first epoch a Keras model gets its first look at the input sizes (mainly the batch size) and has to build the GPU code from the underlying graph before it can begin execution. That build time is the wait you are seeing in the first epoch.
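To see how much of the wait is this one-time setup, you could time the first training step separately from the rest. The sketch below is only an illustration under assumptions: `time_steps` and `simulated_step` are hypothetical names, and `simulated_step` fakes a one-time setup cost with a sleep so the example runs without Keras or a GPU.

```python
import time


def time_steps(train_step, batches):
    """Run train_step on each batch.

    Returns (seconds for the first step, mean seconds of the remaining
    steps). The second value is None if there was only one batch.
    """
    durations = []
    for batch in batches:
        start = time.perf_counter()
        train_step(batch)
        durations.append(time.perf_counter() - start)
    if not durations:
        raise ValueError("need at least one batch")
    rest = durations[1:]
    mean_rest = sum(rest) / len(rest) if rest else None
    return durations[0], mean_rest


state = {"built": False}


def simulated_step(batch):
    """Hypothetical stand-in for a training step with a one-time setup cost."""
    if not state["built"]:
        time.sleep(0.05)  # pretend this is the graph being built
        state["built"] = True
    time.sleep(0.005)  # pretend this is the actual training work


first, mean_rest = time_steps(simulated_step, range(5))
print("first step: %.4fs, later steps (mean): %.4fs" % (first, mean_rest))
```

With a real model you could pass something like `lambda batch: model.train_on_batch(*batch)` as `train_step`, feeding it batches taken from your generator. If the later steps are still slow, the setup cost is not the main problem.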
Hope that helps,
Isaac