Best way to deploy a machine learning solution that scales automatically?

Jana Cavojska

Feb 25, 2021, 11:33:03 AM
to gce-discussion
I created a virtual machine with pre-installed PyTorch for this purpose, and I see it's possible to manually add GPUs or change the number of CPUs/RAM later.

But what is the intended way for a cloud ML solution to scale automatically?
Is this (only) possible with the solutions listed under "ARTIFICIAL INTELLIGENCE", such as "AI Platform (Unified)"?
Or are there yet other solutions I've overlooked?
I don't suppose this should be done with Cloud Run...? (I can't imagine how to say via a Dockerfile that there are supposed to be GPUs attached).

Do I understand correctly that AI Platform and the other "ARTIFICIAL INTELLIGENCE" solutions are the way to go for scalable, standard tasks like image classification, but that custom pre-/postprocessing code is currently a beta feature, so if it doesn't work, one should set up a virtual machine instance on Compute Engine instead?

Thanks!
Sorry if this is not quite the right group - AI doesn't have its own, general group.

Fady (Google Cloud Platform)

Feb 25, 2021, 4:59:53 PM
to gce-discussion
You are correct: if you want to manage the infrastructure yourself, you can create VMs or clusters in different ways. However, I believe you want a managed solution through AI Platform. It seems it is just a matter of terminology. Maybe you meant using distributed training, though it is not clear whether the worker nodes with this solution can scale automatically.


Jana Cavojska

Feb 25, 2021, 5:58:10 PM
to gce-discussion
Thank you! I suppose in regard to automatic scaling, I was mostly thinking of predictions and varying request loads rather than training. For this purpose, I found this PyTorch tutorial.
I was just wondering if there were other AI solutions with GPUs and automated scaling somewhere other than AI Platform that I might have overlooked.

swaylan

Mar 3, 2021, 11:17:33 AM
to gce-discussion
Currently, the only available option for AI solutions with GPUs in conjunction with auto-scaling is online prediction. Its configuration revolves around node allocation and machine type [1][2], and there is an article that gives a more high-level overview of its capabilities and implementation [3]. I hope this information helps to clarify.
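
As a rough illustration of the node-allocation and machine-type settings mentioned above, here is a minimal sketch that only builds a version request body as a plain dictionary; it does not call any API. The field names (`machineType`, `autoScaling.minNodes`, `acceleratorConfig`) reflect my understanding of the AI Platform Prediction version resource and should be checked against the current reference. The helper name `build_version_body`, the minimum of one node, and the example values are assumptions made for illustration.

```python
import json
from typing import Any, Dict, Optional


def build_version_body(
    name: str,
    machine_type: str,
    min_nodes: int,
    gpu_type: Optional[str] = None,
    gpu_count: int = 0,
) -> Dict[str, Any]:
    """Build a request body for an auto-scaled model version (hypothetical helper).

    Assumption: at least one node is kept running, so min_nodes must be >= 1.
    """
    if min_nodes < 1:
        raise ValueError("min_nodes must be at least 1 in this sketch")

    body: Dict[str, Any] = {
        "name": name,
        "machineType": machine_type,
        # Auto-scaling: the service adds nodes above this floor as load grows.
        "autoScaling": {"minNodes": min_nodes},
    }

    # Attach an accelerator only when both a type and a positive count are given.
    if gpu_type is not None and gpu_count > 0:
        body["acceleratorConfig"] = {"type": gpu_type, "count": gpu_count}

    return body


if __name__ == "__main__":
    # Example values are assumptions, not recommendations.
    example = build_version_body("v1", "n1-standard-4", 1, "NVIDIA_TESLA_T4", 1)
    print(json.dumps(example, indent=2))
```

A body like this would then be passed to whichever client or REST call creates the version; the references [1][2] describe which machine types and node settings are actually permitted.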
