Best way to deploy a machine learning solution that scales automatically?

Jana Cavojska

Feb 25, 2021, 11:33:03 AM

to gce-discussion

I created a virtual machine with pre-installed PyTorch for this purpose, and I see it's possible to manually add GPUs or change the number of CPUs/RAM later.

But what is the intended way for a cloud ML solution to scale automatically?

Is this (only) possible with the solutions listed under "ARTIFICIAL INTELLIGENCE", such as "AI Platform (Unified)"?

Or are there yet other solutions I've overlooked?

I don't suppose this should be done with Cloud Run...? (I can't imagine how to say via a Dockerfile that there are supposed to be GPUs attached).

Do I understand correctly that AI Platform and the other "ARTIFICIAL INTELLIGENCE" solutions are the way to go for scalable, standard tasks like image classification? And that, since custom pre-/postprocessing code is currently a beta feature, one should fall back to a virtual machine instance on Compute Engine if it doesn't work?

Thanks!

Sorry if this is not quite the right group - AI doesn't have its own, general group.

Fady (Google Cloud Platform)

Feb 25, 2021, 4:59:53 PM

to gce-discussion

You are correct: if you want to manage the infrastructure yourself, you can create VMs or clusters in various ways. However, I believe you want a managed solution through AI Platform. It may just be a matter of terminology; maybe you meant using Distributed training. It is not clear, though, whether the worker nodes in that solution can scale automatically.

Jana Cavojska

Feb 25, 2021, 5:58:10 PM

to gce-discussion

Thank you! I suppose in regard to automatic scaling, I was mostly thinking of predictions and varying request loads rather than training. For this purpose, I found this PyTorch tutorial.

I was just wondering whether there are other AI solutions with GPUs and automated scaling, somewhere other than AI Platform, that I might have overlooked.

swaylan

Mar 3, 2021, 11:17:33 AM

to gce-discussion

Currently, for AI solutions with GPUs combined with autoscaling, the only available option is online prediction. Its configuration revolves around node allocation and machine type [1][2], and this article gives a higher-level overview of its capabilities and implementation [3]. I hope this information helps to clarify.

[1] https://cloud.google.com/ai-platform/prediction/docs/overview#node_allocation_for_online_prediction

[2] https://cloud.google.com/ai-platform/prediction/docs/machine-types-online-prediction#scaling

[3] https://cloud.google.com/blog/products/ai-machine-learning/scaling-machine-learning-predictions
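As a rough illustration of the node allocation and machine-type settings described in [1] and [2], here is a minimal sketch of a request body for creating a model version in AI Platform Prediction with a GPU and autoscaling. The helper name build_version_body, the example bucket path, and the default machine and GPU types are hypothetical choices for illustration, not values from this thread. The field names (machineType, acceleratorConfig, autoScaling.minNodes) follow the AI Platform Version resource as far as I know; check them against the current API reference before relying on them. The sketch only builds the dictionary and does not call the API.

```python
import json


def build_version_body(name, deployment_uri, machine_type="n1-standard-4",
                       gpu_type="NVIDIA_TESLA_T4", gpu_count=1, min_nodes=1):
    """Build an illustrative body for a versions.create request.

    Assumption: GPU-backed (Compute Engine N1) machine types cannot scale
    to zero nodes, so min_nodes is required to be at least 1 here.
    """
    if min_nodes < 1:
        raise ValueError("min_nodes must be >= 1 for GPU-backed machine types")
    if gpu_count < 1:
        raise ValueError("gpu_count must be >= 1")
    return {
        "name": name,
        "deploymentUri": deployment_uri,  # where the model artifacts live
        "machineType": machine_type,
        "acceleratorConfig": {"count": gpu_count, "type": gpu_type},
        # Autoscaling adds nodes above this floor as request load grows.
        "autoScaling": {"minNodes": min_nodes},
        # Runtime or container settings are omitted; a real request needs them.
    }


if __name__ == "__main__":
    body = build_version_body("v1", "gs://example-bucket/model")
    print(json.dumps(body, indent=2))
```

The floor set by minNodes is the main cost lever: a higher value keeps more nodes warm for traffic spikes, while a lower one reduces idle cost at the expense of slower reaction to sudden load.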
