You’re correct, but AFAIK the choice of measurement mechanism is irrelevant in this case, since the time intervals involved are relatively large (hundreds of milliseconds).
I’d have used RDTSC, but the machine I’m on has DVFS enabled and I was too lazy to turn it off…
The numbers I’m reporting aren’t from a single RPC call, though; they’re averaged over several invocations of the same RPC:
$ taskset -c 1 ./client -f ../data/video/__909tIOxbc_temp.mp4
video file: ../data/video/__909tIOxbc_temp.mp4
Press control-c to quit at any point
6 objects detected.
This request took 247.746 milliseconds
6 objects detected.
This request took 240.866 milliseconds
7 objects detected.
This request took 230.565 milliseconds
8 objects detected.
This request took 227.763 milliseconds
10 objects detected.
This request took 229.529 milliseconds
10 objects detected.
This request took 225.853 milliseconds
10 objects detected.
This request took 220.809 milliseconds
9 objects detected.
This request took 229.487 milliseconds
8 objects detected.
This request took 237.131 milliseconds
7 objects detected.
This request took 228.043 milliseconds
8 objects detected.
This request took 233.57 milliseconds
9 objects detected.
This request took 230.96 milliseconds
9 objects detected.
This request took 226.805 milliseconds
10 objects detected.
This request took 222.559 milliseconds
9 objects detected.
This request took 233.108 milliseconds
9 objects detected.
This request took 225.9 milliseconds
8 objects detected.
This request took 227.267 milliseconds
9 objects detected.
This request took 227.419 milliseconds
9 objects detected.
This request took 232.574 milliseconds
9 objects detected.
This request took 231.147 milliseconds
10 objects detected.
This request took 230.447 milliseconds
$ taskset -c 5 ./server ../cfg/coco.data ../cfg/yolov3.cfg ../weights/yolov3.weights
Server listening on localhost:50051
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.904 milliseconds
0x7fff5d1dcc90 doDetection: took 66.514 milliseconds
0x7fff5d1dcc90Server took 66.55 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.804 milliseconds
0x7fff5d1dcc90 doDetection: took 66.263 milliseconds
0x7fff5d1dcc90Server took 66.288 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.762 milliseconds
0x7fff5d1dcc90 doDetection: took 66.797 milliseconds
0x7fff5d1dcc90Server took 66.825 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.768 milliseconds
0x7fff5d1dcc90 doDetection: took 66.182 milliseconds
0x7fff5d1dcc90Server took 66.209 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.78 milliseconds
0x7fff5d1dcc90 doDetection: took 66.244 milliseconds
0x7fff5d1dcc90Server took 66.271 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.772 milliseconds
0x7fff5d1dcc90 doDetection: took 66.102 milliseconds
0x7fff5d1dcc90Server took 66.13 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.768 milliseconds
0x7fff5d1dcc90 doDetection: took 66.107 milliseconds
0x7fff5d1dcc90Server took 66.135 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.788 milliseconds
0x7fff5d1dcc90 doDetection: took 71.868 milliseconds
0x7fff5d1dcc90Server took 72.365 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.795 milliseconds
0x7fff5d1dcc90 doDetection: took 66.157 milliseconds
0x7fff5d1dcc90Server took 66.186 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.815 milliseconds
0x7fff5d1dcc90 doDetection: took 66.255 milliseconds
0x7fff5d1dcc90Server took 66.284 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.746 milliseconds
0x7fff5d1dcc90 doDetection: took 66.489 milliseconds
0x7fff5d1dcc90Server took 66.516 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.902 milliseconds
0x7fff5d1dcc90 doDetection: took 68.414 milliseconds
0x7fff5d1dcc90Server took 68.452 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.758 milliseconds
0x7fff5d1dcc90 doDetection: took 66.615 milliseconds
0x7fff5d1dcc90Server took 66.643 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.811 milliseconds
0x7fff5d1dcc90 doDetection: took 66.339 milliseconds
0x7fff5d1dcc90Server took 66.368 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 24.714 milliseconds
0x7fff5d1dcc90 doDetection: took 70.671 milliseconds
0x7fff5d1dcc90Server took 70.704 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.805 milliseconds
0x7fff5d1dcc90 doDetection: took 66.269 milliseconds
0x7fff5d1dcc90Server took 66.299 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.836 milliseconds
0x7fff5d1dcc90 doDetection: took 66.73 milliseconds
0x7fff5d1dcc90Server took 66.755 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.791 milliseconds
0x7fff5d1dcc90 doDetection: took 65.679 milliseconds
0x7fff5d1dcc90Server took 65.707 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.826 milliseconds
0x7fff5d1dcc90 doDetection: took 66.242 milliseconds
0x7fff5d1dcc90Server took 66.271 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.804 milliseconds
0x7fff5d1dcc90 doDetection: took 67.864 milliseconds
0x7fff5d1dcc90Server took 67.893 milliseconds
doDetection: new requeust 0x7fff5d1dcc90
0x7fff5d1dcc90 GPU processing took 23.807 milliseconds
0x7fff5d1dcc90 doDetection: took 67.705 milliseconds
0x7fff5d1dcc90Server took 67.736 milliseconds
Re flatbuffers:
AFAIK, flatbuffers doesn’t transfer anything that is unused (fields can be marked deprecated in the schema).
If a field isn’t set to a value other than its default, it isn’t transferred over the wire; instead, it is repopulated with the default value (specified in the schema) on the client/server side as needed.
Of course, this only works for scalars. If you need to deprecate a vector, then yes, you’re kinda hosed...
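To make that concrete, a hypothetical schema fragment (table and field names are made up for illustration) showing both mechanisms:

```
table DetectionRequest {
  frame:[ubyte];                           // vectors can't benefit from default elision
  old_threshold:float = 0.5 (deprecated);  // accessors removed; slot kept for compatibility
  threshold:float = 0.5;                   // scalar left at its default is simply not serialized
}
```

Deprecating a field keeps its vtable slot so old readers don’t break, and scalars equal to their schema default take up no space on the wire.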
Hmmm, I think the gRPC documentation should point to flatbuffers (which supports gRPC out of the box) as an alternative suited to low-latency use cases (while making the drawbacks you mentioned clear).