Docker Run for tensorflow/serving


Cassie Leong
Jun 1, 2020, 11:55:03 PM
to TensorFlow Extended (TFX)
Hi everyone, I've been having an issue when I run Docker to expose the REST port for serving the model generated by the TFX Trainer output.
I run the following command in a GCP Notebook, but the cell never finishes executing. It just keeps running and I never see the end of it. Is it normal for it to take this long, or is there something I've missed? It always stops at this line: "[evhttp_server.cc : 238] NET_LOG: Entering the event loop ..."


!docker run -t --rm -p 8501:8501 \
    -v "/home/jupyter/.../saved_models/:/models/ea/1" \
    -e MODEL_NAME=ea \
    tensorflow/serving

2020-06-02 03:34:23.001881: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:203] Restoring SavedModel bundle.
2020-06-02 03:34:23.036911: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:152] Running initialization op on SavedModel bundle at path: /models/ea/1
2020-06-02 03:34:23.049433: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:333] SavedModel load for tags { serve }; Status: success: OK. Took 83552 microseconds.
2020-06-02 03:34:23.050415: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:105] No warmup data file found at /models/ea/1/assets.extra/tf_serving_warmup_requests
2020-06-02 03:34:23.051155: I tensorflow_serving/core/loader_harness.cc:87] Successfully loaded servable version {name: ea version: 1}
2020-06-02 03:34:23.053898: I tensorflow_serving/model_servers/server.cc:358] Running gRPC ModelServer at 0.0.0.0:8500 ...
[warn] getaddrinfo: address family for nodename not supported
2020-06-02 03:34:23.055473: I tensorflow_serving/model_servers/server.cc:378] Exporting HTTP/REST API at:localhost:8501 ...
[evhttp_server.cc : 238] NET_LOG: Entering the event loop ...

Irene Giannoumis
Jun 2, 2020, 12:00:18 AM
to Cassie Leong, Pedram Pejman, TensorFlow Extended (TFX)
+Pedram Pejman to see if he can help with this


Gautam Vasudevan
Jun 2, 2020, 12:39:06 AM
to Irene Giannoumis, Cassie Leong, Pedram Pejman, TensorFlow Extended (TFX)
I think this is just an interaction between docker and the notebook. “Entering event loop” means the model server is up and ready to field requests.

I think you need to run the docker command with the -d switch to run the container in detached (daemon) mode, so the cell completes and allows the next cell to run.
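
Something like this should work (the same command from your first message, just with -d added; the elided path is kept as a placeholder):

!docker run -t --rm -d -p 8501:8501 \
    -v "/home/jupyter/.../saved_models/:/models/ea/1" \
    -e MODEL_NAME=ea \
    tensorflow/serving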


Cassie Leong
Jun 2, 2020, 12:53:48 AM
to TensorFlow Extended (TFX)
Thanks for the reply, and yes, you are right. When I stop the cell and run !docker ps, I can see the container is up and the port is being served.

When -d is included in the docker command, the cell executes successfully and moves on to the next one.

My next issue is similar to another post: the inference data needs to be serialized using Base64. Is there any documentation about this? Thanks again!


Gautam Vasudevan
Jun 2, 2020, 1:11:42 AM
to Cassie Leong, TensorFlow Extended (TFX)

Cassie Leong
Jun 2, 2020, 1:28:22 AM
to TensorFlow Extended (TFX)
Thanks again, Gautam. Please allow me to ask a naive question: I presume that all the data features of type string need to be serialized using Base64, is that correct? The model being served uses structured data with strings and goes through a Keras functional API model.


Gautam Vasudevan
Jun 2, 2020, 1:55:37 AM
to Cassie Leong, TensorFlow Extended (TFX)
Without knowing details of your specific setup and code, generally speaking you will base64 encode requests that would otherwise be sent up as bytes (say a serialized structure, image data, etc.). So I think that you probably do want to base64 encode what you’re sending up based on what I think you’re saying.

Other types of data have native representations in JSON - how TF types map to JSON is documented here: 

It really depends on your serving function - you can see exactly what’s expected from your model using the saved_model_cli:
$ saved_model_cli show --dir /path/to/saved_model_dir --all

That will tell you what the serving function signature(s) looks like, and you can use that to determine what you need to send in your request to the model server.
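
In Python, producing the base64 string itself is straightforward (a minimal sketch; the raw bytes here are just a stand-in for whatever your signature actually expects):

import base64

raw_bytes = b"..."  # stand-in, e.g. a serialized tf.Example or the contents of an image file
b64_value = base64.b64encode(raw_bytes).decode("utf-8")
# In the REST request, this value is wrapped as {"b64": b64_value}.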



Cassie Leong
Jun 2, 2020, 7:48:57 AM
to TensorFlow Extended (TFX)
Thank you for your guidance. 

I'm new to TFX and learning to create a Kubeflow pipeline through Google AI Platform Pipelines. Though I've successfully created the pipeline, I'm trying to find options for serving the saved_model to get an endpoint for prediction. I'm here to learn and thanks in advance!

So far, I've tried two ways:
1) Deploying to Google AI Platform Models, but the JSON sample data input isn't accepted in its current format; it may be that the data has to be serialized using Base64.
2) Serving the model through Docker, aiming to get a REST API from model serving. I'm still struggling to get the API to return a prediction.
 

!curl -d '{"inputs": {"examples": [{"inputs/0": ["A"], "inputs/1": ["B"]}]}}' \
    -X POST http://localhost:8502/v1/models/ea:predict

{ "error": "JSON Value: {\n \"Country_Code_xf\": [\n \"US\"\n ],\n \"Project_Type_xf\": [\n \"Delivery\"\n ]\n} not formatted correctly for base64 data" }


$ saved_model_cli show --dir /path/to/saved_model_dir --all

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['examples'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: serving_default_examples:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['outputs'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: StatefulPartitionedCall:0
  Method name is: tensorflow/serving/predict
WARNING:tensorflow:From /opt/conda/lib/python3.7/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1786: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.

Defined Functions:
  Function Name: '__call__'
    Option #1
      Callable with:
        Argument #1
          DType: list
          Value: [TensorSpec(shape=(None, 3), dtype=tf.float32, name='inputs/0'), TensorSpec(shape=(None, 8), dtype=tf.float32, name='inputs/1')]
        Argument #2
          DType: bool
          Value: True
        Argument #3
          DType: NoneType
          Value: None






Gautam Vasudevan
Jun 2, 2020, 11:28:24 AM
to Cassie Leong, TensorFlow Extended (TFX)
It's pretty tough to debug what's happening without a working example, but your model does take a string argument, and based on what you're saying and the error, that implies a base64-encoded input. Try taking the chunk of data it appears to expect (Country Code / Project Type), base64 encoding it first, and then passing it up as row data:

{"instances": [{"b64": "<base64 encoded string of your structured data>"}]}

Check out the resnet example I posted earlier, which does this with image data.
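
As a rough sketch (I'm guessing at the feature names from your error message, and at whether they end up as bytes or numeric after your preprocessing), the request could be built like this in Python:

import base64
import json

import requests  # assumed to be available in the notebook
import tensorflow as tf

# Build a tf.Example with the (assumed) transformed feature names, serialize it,
# and base64 encode the bytes for the REST request.
example = tf.train.Example(features=tf.train.Features(feature={
    "Country_Code_xf": tf.train.Feature(
        bytes_list=tf.train.BytesList(value=[b"US"])),
    "Project_Type_xf": tf.train.Feature(
        bytes_list=tf.train.BytesList(value=[b"Delivery"])),
}))

payload = {
    "instances": [
        {"b64": base64.b64encode(example.SerializeToString()).decode("utf-8")}
    ]
}

resp = requests.post("http://localhost:8502/v1/models/ea:predict",
                     data=json.dumps(payload))
print(resp.json())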


Cassie Leong
Jun 3, 2020, 1:41:20 AM
to TensorFlow Extended (TFX)
Thank you so much for your guidance. After playing around with the format, I am able to call the prediction endpoint now.

Thanks!



Cassie Leong
Jun 3, 2020, 2:49:45 AM
to TensorFlow Extended (TFX)
I have another question regarding Docker: how do I serve the model when the model is saved in Google Cloud Storage?

Thanks! 

!docker run -t --rm -d -p 8502:8501 \
    -v "gs://hostedkfp-default-4hco57fcpj/tfx_pipeline_output/egress_access_pipeline/Trainer/model/92/serving_model_dir/:/models/ea2/1" \
    -e MODEL_NAME=ea2 \
    tensorflow/serving

Error:

docker: Error response from daemon: invalid mode: /models/ea2/2.
See 'docker run --help'.


Gautam Vasudevan
Jun 3, 2020, 2:20:03 PM
to Cassie Leong, TensorFlow Extended (TFX)
I think you'd have to mount it so it appears to be on the local file system. If you're doing this within a notebook, you'll probably have to create your own custom docker image that has all the tensorflow serving stuff in it and also mounts GCS to some local path you point your server to.
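
For example, from the notebook host it might look something like this (just a rough sketch; it assumes gcsfuse is installed there, /mnt/gcs is an arbitrary local mount point, and the bucket path is the one from your command):

mkdir -p /mnt/gcs
gcsfuse hostedkfp-default-4hco57fcpj /mnt/gcs
docker run -t --rm -d -p 8502:8501 \
    -v "/mnt/gcs/tfx_pipeline_output/egress_access_pipeline/Trainer/model/92/serving_model_dir:/models/ea2/1" \
    -e MODEL_NAME=ea2 \
    tensorflow/serving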
