Hi Jeremy,
I'm the author of BentoML, happy to clarify those questions from my point of view.
BentoML focuses on turning trained ML models into API servers that are easy to deploy and perform well. KFServing, I think, focuses more on the downstream deployment side, with features such as auto-scaling, A/B testing, multi-armed bandits (MAB), monitoring, and scale-to-zero.
BentoML overlaps with the KFServing Model Server component. The main differences: BentoML has micro-batching, offline batch serving, and model management support, while the KFServing Model Server supports the TensorFlow V1 HTTP API format and has tighter integration with other KFServing components, such as the explainer and canary rollouts.
As far as I know, KFServing itself does not provide micro-batching. The tf-serving project does micro-batching, and I believe it is used in KFServing only when deploying TensorFlow models. It would be interesting to see a deeper integration between BentoML and the KFServing Model Server if the KFServing team is open to that. I'd love to chat more.
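For anyone unfamiliar with the idea, here is a toy sketch of what server-side micro-batching means: individual prediction requests are buffered and flushed to the model as one batch, either when the batch fills up or when a latency window expires. This is a hypothetical illustration of the general technique, not BentoML's or tf-serving's actual implementation (the class and parameter names are made up).

```python
import threading
import queue
import time

class MicroBatcher:
    """Toy micro-batching layer (illustrative sketch, not a real
    BentoML/tf-serving API): buffers per-request inputs and runs
    the model once per batch."""

    def __init__(self, predict_batch, max_batch_size=8, max_latency_s=0.01):
        # predict_batch: function mapping a list of inputs to a list of outputs
        self.predict_batch = predict_batch
        self.max_batch_size = max_batch_size
        self.max_latency_s = max_latency_s
        self.q = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def predict(self, x):
        """Called once per incoming request; blocks until the
        batched result for this request is ready."""
        slot = {"input": x, "done": threading.Event()}
        self.q.put(slot)
        slot["done"].wait()
        return slot["output"]

    def _worker(self):
        while True:
            # Block for the first request, then collect more until the
            # batch is full or the latency window expires.
            batch = [self.q.get()]
            deadline = time.monotonic() + self.max_latency_s
            while len(batch) < self.max_batch_size:
                timeout = deadline - time.monotonic()
                if timeout <= 0:
                    break
                try:
                    batch.append(self.q.get(timeout=timeout))
                except queue.Empty:
                    break
            # One model call serves the whole batch.
            outputs = self.predict_batch([s["input"] for s in batch])
            for slot, out in zip(batch, outputs):
                slot["output"] = out
                slot["done"].set()
```

Under concurrent load this turns N single-item model calls into roughly N / max_batch_size batched calls, which is where the throughput win comes from on GPU-backed models.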
Best,
Chaoyu