4096/4096 [==============================] - 187s - loss: 4.9220 - acc: 0.5042
for the model without BN, but with BN I get this (578 seconds, about 3 times longer):
4096/4096 [==============================] - 578s - loss: 0.7414 - acc: 0.5386
What's happening here? Keep in mind this is running on a CPU (I can't test on a GPU for now). The two models are identical except that the first one has no BN layers.
My model:
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.layers.advanced_activations import PReLU
from keras.layers.normalization import BatchNormalization

model = Sequential()

# Conv -> PReLU -> BN blocks (227x227 RGB input, channels first)
model.add(Convolution2D(32, 11, 11, subsample=(4,4), input_shape=(3,227,227)))
model.add(PReLU())
model.add(BatchNormalization())
model.add(Convolution2D(64, 5, 5, subsample=(2,2)))
model.add(PReLU())
model.add(BatchNormalization())
model.add(Convolution2D(64, 3, 3))
model.add(PReLU())
model.add(BatchNormalization())
model.add(MaxPooling2D((2,2), strides=(2,2)))
model.add(Flatten())

# Dense -> PReLU -> BN -> Dropout blocks
model.add(Dense(400))
model.add(PReLU())
model.add(BatchNormalization())
model.add(Dropout(0.25))
model.add(Dense(400))
model.add(PReLU())
model.add(BatchNormalization())
model.add(Dropout(0.25))
model.add(Dense(3, activation='softmax'))
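To isolate the BN overhead, one way is to time one epoch of each variant on synthetic data. The sketch below is a minimal reproduction harness under Keras 1.x, not code from the original post; build_model, the random data, and the SGD/categorical-crossentropy compile settings are my assumptions.

import time
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.layers.advanced_activations import PReLU
from keras.layers.normalization import BatchNormalization
from keras.utils.np_utils import to_categorical

def build_model(use_bn):
    # Same architecture as above; BN layers are added only when use_bn is True.
    model = Sequential()
    model.add(Convolution2D(32, 11, 11, subsample=(4,4), input_shape=(3,227,227)))
    model.add(PReLU())
    if use_bn:
        model.add(BatchNormalization())
    model.add(Convolution2D(64, 5, 5, subsample=(2,2)))
    model.add(PReLU())
    if use_bn:
        model.add(BatchNormalization())
    model.add(Convolution2D(64, 3, 3))
    model.add(PReLU())
    if use_bn:
        model.add(BatchNormalization())
    model.add(MaxPooling2D((2,2), strides=(2,2)))
    model.add(Flatten())
    model.add(Dense(400))
    model.add(PReLU())
    if use_bn:
        model.add(BatchNormalization())
    model.add(Dropout(0.25))
    model.add(Dense(400))
    model.add(PReLU())
    if use_bn:
        model.add(BatchNormalization())
    model.add(Dropout(0.25))
    model.add(Dense(3, activation='softmax'))
    # Assumed compile settings; the original post does not show them.
    model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
    return model

# 256 random samples instead of the full 4096 to keep memory modest.
X = np.random.rand(256, 3, 227, 227).astype('float32')
Y = to_categorical(np.random.randint(0, 3, size=(256,)), 3)

for use_bn in (False, True):
    m = build_model(use_bn)
    start = time.time()
    m.fit(X, Y, nb_epoch=1, batch_size=32, verbose=0)
    print('use_bn=%s: %.1f s/epoch' % (use_bn, time.time() - start))

Timing both variants this way should make it clear whether the slowdown comes from the BN layers themselves rather than from anything else in the pipeline.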