Async predict requests for > 20k users


Julie Alice Skøien

Jan 18, 2022, 11:07:23 AM
to cloud-recommendations-users
Hi!

I've been using Cloud Retail and Recommendations AI since last November for a customer project.
In short, we're sending personalised newsletters with recommended products (Google Merchant Center) to each user who signed up to receive emails.

I have a file containing a list of user_ids and visitor_ids stored as tuples. For every tuple in that list, I send a prediction request and return the associated recommended products. Later, the list is updated with the users' recommended products, and I can update our CRM system with these new properties. All well and good.

However, this is done in Cloud Functions, and I saw no problem with this when the list had 4000 users. When it was scaled up to 25 000 users, the Cloud Function timed out! It's not even halfway through after the 9 minutes.

I, therefore, want to rewrite the program and make the predictions run asynchronously and be able to run a bunch of them in parallel.

As of now, I've tried changing the code to use PredictionServiceAsyncClient. What's weird is that our new code works on and off. Mostly, it returns:

raise exceptions.from_grpc_error(rpc_error) from rpc_error
google.api_core.exceptions.ServiceUnavailable: 503 Getting metadata from plugin failed with error: None could not be converted to unicode

Then I might execute the same code again, and it runs successfully. I can't seem to find any consistency in when this error is raised and not.

I've painted myself into a corner and I find very few, in fact no, examples of how this client is or should be used. Should it even be possible to run parallel requests on a PredictionServiceAsyncClient? I'm using a service account for local testing and no specific headers are set. I initialise the async client once.

And for what it's worth: I'm not very experienced with either Python or Google Cloud. I might be overcomplicating things, but I was really hoping this was possible to solve with the async client like I'm trying to do now.

More examples of Recommendations AI usage are much wanted!


Nicholas Edelman

Jan 19, 2022, 4:36:24 PM
to Julie Alice Skøien, Eric Larson, cloud-recommendations-users
@Eric Larson to help triage.


Eric Larson

Jan 21, 2022, 7:22:31 PM
to Nicholas Edelman, Julie Alice Skøien, cloud-recommendations-users
I'm not really sure how well PredictionServiceAsyncClient would work.
What Cloud Function are you calling, and how? 9 minutes for 10,000 calls doesn't sound too bad if you're doing that serially; that's what, around 20 QPS?
I probably wouldn't worry about trying to parallelize it if you're only talking 25K calls. But if you need to do hundreds of thousands or millions, then it may make sense to try to speed things up.
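
As a quick check of that arithmetic, here is a small sketch. It assumes the 10,000-call figure above and strictly serial calls inside the 9-minute window; both are estimates, not measured values.

```python
# Back-of-the-envelope throughput, assuming serial calls in a 9-minute window.
calls_completed = 10_000      # assumed figure from the estimate above
window_seconds = 9 * 60       # Cloud Function timeout mentioned in the thread

qps = calls_completed / window_seconds
print(f"approx. {qps:.1f} calls per second")

# Time needed for the full list at the same serial rate.
total_users = 25_000
minutes_needed = total_users / qps / 60
print(f"approx. {minutes_needed:.1f} minutes for {total_users} users")
```

Under those assumptions the rate is roughly 18.5 QPS, and the full list of 25,000 users would need about 22.5 minutes if nothing runs in parallel.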

Depending on how you're making the calls you could either multi-thread it, or it may be easier to split the dataset into chunks and then just create a bunch of processes that each deal with a chunk.  That's probably going to be easier than trying to write multithreaded code.  
If the calls hit our API directly, you'd just need to raise the per-user and QPS limits. If you're going through a GCF then you'd also need to do that, but the GCF itself will also probably need more resources allocated.
I would probably recommend not using a GCF for this and just hitting our API directly.

So I think the basic idea would be something like:
  1. Read email list and break into x # of batches
  2. exec a script that handles each batch, calling Retail API directly, and writing the output somewhere (database?), or pass it off to another script to do something with the results
That's how I'd hack it to go faster anyway...
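
A minimal sketch of those two steps, assuming the list is already in memory as (user_id, visitor_id) tuples. The helper names (`split_into_batches`, `predict_for_user`, `handle_batch`, `run_all`) are made up for illustration, and `predict_for_user` is only a stand-in for the real Retail API call, so treat this as a starting point rather than a finished script.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import Dict, List, Sequence, Tuple

User = Tuple[str, str]  # (user_id, visitor_id)


def split_into_batches(users: Sequence[User], batch_count: int) -> List[List[User]]:
    """Split users into at most batch_count roughly equal batches."""
    if batch_count < 1:
        raise ValueError("batch_count must be at least 1")
    size = max(1, -(-len(users) // batch_count))  # ceiling division
    return [list(users[i:i + size]) for i in range(0, len(users), size)]


def predict_for_user(user: User) -> List[str]:
    """Hypothetical stand-in for one predict call against the Retail API."""
    _user_id, visitor_id = user
    return [f"product-for-{visitor_id}"]


def handle_batch(batch: List[User]) -> Dict[str, List[str]]:
    """Run serially through one batch; this executes inside a worker process."""
    # A real worker would create its own API client here, once per process.
    return {user[0]: predict_for_user(user) for user in batch}


def run_all(users: Sequence[User], batch_count: int = 4) -> Dict[str, List[str]]:
    """Fan the batches out to worker processes and merge the results."""
    batches = split_into_batches(users, batch_count)
    results: Dict[str, List[str]] = {}
    with ProcessPoolExecutor(max_workers=batch_count) as pool:
        for partial in pool.map(handle_batch, batches):
            results.update(partial)
    return results
```

Calling `run_all(users, batch_count=4)` from under an `if __name__ == "__main__":` guard would process four batches side by side. The merged dictionary could then be written to a database or handed to the next script.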




 

Julie Alice Skøien

Jan 24, 2022, 11:16:54 AM
to cloud-recommendations-users
I managed to get around the problem with PredictionServiceAsyncClient after all!


> I'm not really sure how well PredictionServiceAsyncClient would work.
> What Cloud Function are you calling, and how? 9 minutes for 10,000 calls doesn't sound too bad if you're doing that serially; that's what, around 20 QPS?
> I probably wouldn't worry about trying to parallelize it if you're only talking 25K calls. But if you need to do hundreds of thousands or millions, then it may make sense to try to speed things up.


My logs show that predict(), combined with another function that puts the results into a list, took over 20 minutes. Neither the list logic nor predict() should take that long on its own, but I guess the combination of the two was really time-consuming.

> Depending on how you're making the calls you could either multi-thread it, or it may be easier to split the dataset into chunks and then just create a bunch of processes that each deal with a chunk. That's probably going to be easier than trying to write multithreaded code.
> If the calls hit our API directly, you'd just need to raise the per-user and QPS limits. If you're going through a GCF then you'd also need to do that, but the GCF itself will also probably need more resources allocated.
> I would probably recommend not using a GCF for this and just hitting our API directly.
>
> So I think the basic idea would be something like:
>   1. Read email list and break into x # of batches
>   2. exec a script that handles each batch, calling Retail API directly, and writing the output somewhere (database?), or pass it off to another script to do something with the results
> That's how I'd hack it to go faster anyway...

I had already started writing the chunk logic you refer to. However, I wasn't aware that each process needed its own client initialisation. The error I received (see first post) was due to this. Everything worked as soon as the client was initialised (client = retail.PredictionServiceAsyncClient(credentials=credentials)) inside each batch.
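
Roughly, the per-batch pattern looks like the sketch below. To keep it runnable without credentials, `StandInPredictionClient` is a made-up placeholder for retail.PredictionServiceAsyncClient, and its `predict(visitor_id)` signature is simplified; the real client takes a request object. The `max_in_flight` cap is also an assumption, added so a batch doesn't fire every request at once.

```python
import asyncio
from typing import Dict, List, Tuple

User = Tuple[str, str]  # (user_id, visitor_id)


class StandInPredictionClient:
    """Hypothetical placeholder for retail.PredictionServiceAsyncClient."""

    async def predict(self, visitor_id: str) -> List[str]:
        await asyncio.sleep(0)  # yields control, as a network call would
        return [f"product-for-{visitor_id}"]


async def handle_batch(batch: List[User], max_in_flight: int = 10) -> Dict[str, List[str]]:
    """Predict for every user in one batch, with a cap on concurrent requests."""
    # The client is created inside the batch, not shared across processes.
    # Real code: client = retail.PredictionServiceAsyncClient(credentials=credentials)
    client = StandInPredictionClient()
    limit = asyncio.Semaphore(max_in_flight)

    async def predict_one(user: User) -> Tuple[str, List[str]]:
        user_id, visitor_id = user
        async with limit:
            return user_id, await client.predict(visitor_id)

    pairs = await asyncio.gather(*(predict_one(user) for user in batch))
    return dict(pairs)
```

Each worker process would then call `asyncio.run(handle_batch(batch))` on its own chunk of the list.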

Thanks for your comprehensive response! I guess it is a confirmation that my current solution is ok. 