Hi,
I noticed that TensorFlow is much slower than Caffe.
I have a Faster R-CNN model with a VGG16 feature extractor in Caffe. This model is based on this repository: https://github.com/rbgirshick/py-faster-rcnn/
and the input images are 1400x2000.
I have another network, built with TensorFlow and based on this Google blog post: https://cloud.google.com/blog/big-data/2017/06/training-an-object-detector-using-cloud-machine-learning-engine
It is a Faster R-CNN with a ResNet-101 feature extractor, with input images of size 600x1000.
Based on Figure 7 of this article, https://arxiv.org/abs/1611.10012
, with the same configuration, Faster R-CNN with ResNet-101 should be faster than Faster R-CNN with VGG16. In my case the images for the Caffe model are ~4 times bigger, so the TensorFlow Faster R-CNN with ResNet-101 should run inference much faster than the Caffe network. But I can't understand why both inference times are ~2 s.
Is it normal that TensorFlow is so slow?
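For reference, the "~4 times bigger" figure can be checked from the two input sizes quoted above. This is only a rough pixel-count comparison; it assumes neither pipeline rescales the image internally before the feature extractor, which is not verified here.

```python
# Input sizes quoted in this post, in pixels.
caffe_pixels = 1400 * 2000  # Caffe model (VGG16)
tf_pixels = 600 * 1000      # TensorFlow model (ResNet-101)

# How many times more pixels the Caffe model has to process per image.
ratio = caffe_pixels / tf_pixels
print(f"Caffe input has {ratio:.2f}x as many pixels as the TensorFlow input")
```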
I am running inference on a Google Compute server.
OS: Ubuntu 16.04
GPU: Tesla K80
NVIDIA driver version: 375.66
CPU: 8 x Intel(R) Xeon(R) CPU @ 2.60GHz
Code used for TensorFlow inference (the code is based on the notebook found in the tensorflow/models repo):

    import numpy as np
    import tensorflow as tf
    from PIL import Image

    # Path to the frozen inference graph.
    model_ckpt = "output_inference_graph.pb"

    # Load the frozen graph into a new tf.Graph.
    detection_graph = tf.Graph()
    with detection_graph.as_default():
        od_graph_def = tf.GraphDef()
        with tf.gfile.GFile(model_ckpt, 'rb') as fid:
            serialized_graph = fid.read()
            od_graph_def.ParseFromString(serialized_graph)
            tf.import_graph_def(od_graph_def, name='')

    config = tf.ConfigProto(allow_soft_placement=True)
    with detection_graph.as_default():
        with tf.Session(graph=detection_graph, config=config) as sess:
            # image_path is set elsewhere in the script.
            image = Image.open(image_path)
            (im_width, im_height) = image.size
            image_np = np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            # Each box represents a part of the image where a particular object was detected.
            boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
            # Each score represents the level of confidence for each of the objects.
            # Score is shown on the result image, together with the class label.
            scores = detection_graph.get_tensor_by_name('detection_scores:0')
            classes = detection_graph.get_tensor_by_name('detection_classes:0')
            num_detections = detection_graph.get_tensor_by_name('num_detections:0')
            # Actual detection.
            (boxes, scores, classes, num_detections) = sess.run(
                [boxes, scores, classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})
After timing the operations, the `sess.run` call takes ~2 s (I am not counting the time to read the graph, which is also ~2 s).
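To make the ~2 s figure easier to interpret, the measurement could be split into a first call and a steady-state average, since the first `sess.run` in a session may include one-time setup cost. The sketch below is a generic timing helper and not part of the script above; `time_steady_state`, `run_once`, `warmup` and `repeats` are hypothetical names, and the workload in the usage part is a stand-in for the real `sess.run` call.

```python
import time


def time_steady_state(run_once, warmup=2, repeats=10):
    """Time a zero-argument callable.

    Returns (first_call_seconds, mean_steady_state_seconds).
    """
    # The very first call is timed separately, since it may include setup cost.
    start = time.perf_counter()
    run_once()
    first_call = time.perf_counter() - start

    # Extra untimed warm-up calls before the steady-state measurement.
    for _ in range(warmup):
        run_once()

    # Average over several calls to smooth out noise.
    start = time.perf_counter()
    for _ in range(repeats):
        run_once()
    mean_steady = (time.perf_counter() - start) / repeats
    return first_call, mean_steady


if __name__ == "__main__":
    # Stand-in workload (assumption). In the script above, run_once would wrap
    # the sess.run(...) call, with the tensor handles kept in variables that
    # are not overwritten by the returned arrays.
    first, steady = time_steady_state(lambda: sum(range(100000)))
    print(f"first call: {first:.4f} s, steady state: {steady:.4f} s per call")
```

If the steady-state number is much lower than the first call, the ~2 s would mostly be one-time overhead and not the per-image inference cost.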
--
You received this message because you are subscribed to the Google Groups "Discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to discuss+unsubscribe@tensorflow.org.
To post to this group, send email to dis...@tensorflow.org.
To view this discussion on the web visit https://groups.google.com/a/tensorflow.org/d/msgid/discuss/a3b78be5-9185-4a9a-bcba-2a855f742fb2%40tensorflow.org.