First run of inference graph is relatively slow.


vigen.sa...@gmail.com

Jan 11, 2017, 2:09:56 AM
to Discuss
When I call sess.run() for the first time after loading the model graph, it is noticeably slower than subsequent runs.
For example: the first run takes 1.2 s, the second 0.3 s, the third 0.2 s, and after 4-5 runs the time stabilizes.
I understand that TensorFlow does some profile-based optimization during sess.run(); my question is how we can store the
graph in an optimized state in order to eliminate the slow first run?
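
For reference, a minimal sketch of how I measure this (the "model.pb" path and the "input:0"/"output:0" tensor names are just placeholders for my actual graph):

import time
import numpy as np
import tensorflow as tf

# Load a frozen GraphDef (placeholder path and tensor names).
graph_def = tf.GraphDef()
with tf.gfile.GFile("model.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name="")

with tf.Session(graph=graph) as sess:
    x = np.zeros((1, 224, 224, 3), dtype=np.float32)
    for i in range(5):
        start = time.time()
        sess.run("output:0", feed_dict={"input:0": x})
        print("run %d: %.3f s" % (i, time.time() - start))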

Vikesh Khanna

Jan 11, 2017, 3:43:24 AM
to vigen.sa...@gmail.com, Discuss
Although I am not familiar with the internals of TensorFlow, it would be reasonable to assume that the slowness of the first run stems from things like device memory copying, cache misses, etc.

Ideally you should run the session in a long-running process; the amortized cost beyond the first few runs will be small and stable. I recommend using TensorFlow Serving, the official serving system for TensorFlow. It allows you to run a TF graph as an RPC service.
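
If you don't want to set up TensorFlow Serving, a rough sketch of the long-running-process idea is to warm the session up once at startup and then keep it alive (the tensor names here are placeholders, and "graph" is loaded as in your message):

import numpy as np
import tensorflow as tf

# Session is created once and kept alive for the lifetime of the process.
sess = tf.Session(graph=graph)

# A few throwaway warm-up runs at startup, so later requests only pay the
# steady-state per-run cost.
dummy = np.zeros((1, 224, 224, 3), dtype=np.float32)
for _ in range(5):
    sess.run("output:0", feed_dict={"input:0": dummy})

def handle_request(batch):
    # Real requests reuse the already-warmed session.
    return sess.run("output:0", feed_dict={"input:0": batch})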

Thanks,




--
Vikesh Khanna,
Masters, Computer Science
Stanford University

Yaroslav Bulatov

Jan 11, 2017, 11:05:59 AM
to Vikesh Khanna, Vigen Sahakyan, Discuss
There are also things like initializing memory allocators, and GPU-related initialization (if you use a GPU). 1.2 seconds is not that long; it can take a couple of seconds just to run a matmul on a GPU the first time.
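
A rough sketch of absorbing that one-time GPU cost at process startup, before the real inference graph runs (assuming much of the initialization, such as CUDA context creation, is process-wide):

import tensorflow as tf

# Throwaway op that forces GPU initialization up front.
with tf.device("/gpu:0"):
    a = tf.random_normal([1024, 1024])
    warmup = tf.matmul(a, a)

with tf.Session() as sess:
    sess.run(warmup)  # the multi-second first-touch cost is paid here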
