Customer Match User Lists: best way to refresh them

Daniel Vasilan

Feb 7, 2024, 4:09:19 AM
to Google Ads API and AdWords API Forum
We are building a POC in order to see how we can use Offline User Data jobs to manage User Lists for Customer Match (Python, google-ads library).

We create several lists and use offline jobs to match customers based on their hashed e-mail addresses.
But then, we'll have to periodically update these lists: some e-mails will be removed, others will be added, and others will switch from one list to another (remove + add).
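A periodic update like this starts from a diff between the previous and the desired membership of each list. Below is a minimal sketch, assuming the previous snapshot of e-mails is stored on the caller's side and assuming the usual normalization before hashing (trim whitespace, lowercase, SHA-256). `normalize_and_hash` and `diff_membership` are hypothetical helper names, not part of the google-ads library.

```python
from __future__ import annotations

import hashlib


def normalize_and_hash(email: str) -> str:
    """Trim and lowercase an e-mail address, then return its SHA-256 hex digest."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def diff_membership(previous: set[str], desired: set[str]) -> tuple[set[str], set[str]]:
    """Return (to_add, to_remove) as sets of hashed e-mails."""
    previous_hashed = {normalize_and_hash(e) for e in previous}
    desired_hashed = {normalize_and_hash(e) for e in desired}
    return desired_hashed - previous_hashed, previous_hashed - desired_hashed


# Hypothetical snapshots: one address stays, one leaves, one joins.
previous = {"alice@example.com", "bob@example.com"}
desired = {" Alice@Example.com ", "carol@example.com"}
to_add, to_remove = diff_membership(previous, desired)
print(len(to_add), len(to_remove))  # 1 1
```

A user who switches lists simply appears in `to_remove` for one list and in `to_add` for the other.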

Question: What is the best way to achieve it?
Based on your documentation, we cannot mix create and remove operations in the same offline job.
Also, we cannot have concurrently running jobs, one for removal and one for adding new data on the same user list.
We cannot run them in sequence either: there is no guarantee of when the first job will finish, and we cannot wait for it because these processes take extremely long.
We understand that a remove_all operation followed by create operations can be added to the same job, so we can reset and refresh the lists every time. We just want to make sure that's the best way to implement it, and to ask whether there are any risks associated with this approach.

If we go this route, will we have gaps in data availability?
Will the remove_all happen first, leaving the list empty until it is matched again several hours later?
Will the Ads processes consuming data from these lists be impacted for the full duration of the job execution?

Thanks!

Google Ads API Forum Advisor

Feb 7, 2024, 9:04:39 AM
to dd.va...@gmail.com, adwor...@googlegroups.com
Hi,

Thank you for reaching out to the Google Ads API support team.

Please find our responses to your queries below:

1) Will the remove_all happen first, leaving the list empty until it is matched again several hours later?

Note that the removeAll operation removes all previously provided data. This is only supported for Customer Match.

2) Will the Ads processes consuming data from these lists be impacted for the full duration of the job execution?

Kindly note that the processing time for offline user data jobs can vary depending on the size and complexity of the data, the availability of system resources, and any unexpected errors.

Also, avoid simultaneously running multiple OfflineUserDataJob processes that modify the same user list (that is, multiple jobs whose CustomerMatchUserListMetadata.user_list point to the same resource name). You may refer to the Best practices guide when designing your Customer Match integration. Also, you may follow Troubleshoot customer list issues which will help you troubleshoot low match rates and common errors that might come up in your Customer Match integration. Hope this helps. Let us know if you have any further queries.
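
One way to respect the "no simultaneous jobs on the same user list" advice is to check for unfinished jobs before creating a new one. The sketch below is an assumption-laden illustration, not an official recipe: `client` is an already configured `GoogleAdsClient`, `has_active_job` is a hypothetical helper, and the query fetches all jobs and filters on the caller's side for simplicity (a real integration would likely narrow the query).

```python
# Job statuses treated as "still in flight" for the purpose of this check.
ACTIVE_STATUSES = {"PENDING", "RUNNING"}

JOBS_QUERY = """
    SELECT
      offline_user_data_job.resource_name,
      offline_user_data_job.status,
      offline_user_data_job.customer_match_user_list_metadata.user_list
    FROM offline_user_data_job"""


def has_active_job(client, customer_id, user_list_resource_name):
    """Return True if any pending or running job targets the given user list."""
    ga_service = client.get_service("GoogleAdsService")
    rows = ga_service.search(customer_id=customer_id, query=JOBS_QUERY)
    for row in rows:
        job = row.offline_user_data_job
        targets_list = (
            job.customer_match_user_list_metadata.user_list
            == user_list_resource_name
        )
        if targets_list and job.status.name in ACTIVE_STATUSES:
            return True
    return False
```

Note that a job that was created but never run also shows up as pending, so stale jobs can make this check overly cautious.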
 
This message is in relation to case "ref:!00D1U01174p.!5004Q02rytff:ref"

Thanks,
 
Google Ads API Team


Daniel Vasilan

Feb 8, 2024, 4:01:12 AM
to Google Ads API and AdWords API Forum
Thanks for your reply. It doesn't answer my questions, so I'll try to restructure them.

1. Based on your answer, I understand the usage of remove_all is the only way to achieve what we want (refresh of a user list).
If that's not correct, please let me know

2. We want to understand what happens with an existing User List from the moment we start running an offline job containing remove_all + create operations on it.
I assume the remove_all operation will run first, so the user list will be empty as soon as this operation completes. 
Then, the create operations will add the accounts to the list and match them. As we know, the full process can take many hours (days, too - based on our tests).
Our question is: 
What is the impact of this on the processes that use the data from this list? Will they find an empty or partially loaded user list while the job is executing?
We want to assess the risks and implications before moving on with this solution.

Thanks for the links for the documentation.
We went through them many times, but we didn't find answers for the above questions. 

Thanks,
Daniel

Google Ads API Forum Advisor

Feb 8, 2024, 8:40:39 AM
to dd.va...@gmail.com, adwor...@googlegroups.com
Hi Daniel,

Thank you for getting back to us.

To update your lists with the latest data, it is generally more efficient to append or remove individual users, rather than removing all users from the list and uploading them from scratch. I would suggest you refer to the Manage customer list document for more information.

To remove all users from a list, set remove_all to true in an OfflineUserDataJobOperation, then issue a RunOfflineUserDataJob request with the resource name associated with the remove_all operation.
Note that when a remove_all operation is included, it must be the first operation in a job. If not, then running the job returns an INVALID_OPERATION_ORDER error.

To completely replace the members of a user list with new members, order the operations in AddOfflineUserDataJobOperationsRequest in this sequence:
    • Set remove_all to true in an OfflineUserDataJobOperation.
    • For each new member, add a create operation setting their UserData in an OfflineUserDataJobOperation.
When you run your job, the Google Ads API will first mark all current members of the list for removal, and then apply the create operations.

Also, note that remove_all operations are executed hourly, and could run for up to 24 hours. Hope that helps. Let me know if you have any further questions.
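
The replace sequence described above (remove_all first, then one create per member) might be sketched with the Python google-ads library as follows. This is an illustration under assumptions: `client` is an already configured `GoogleAdsClient`, `job_resource_name` points to a Customer Match job created beforehand, the e-mails are already normalized and SHA-256 hashed, and `build_replace_operations` / `submit_replace` are hypothetical helper names.

```python
def build_replace_operations(client, hashed_emails):
    """Return [remove_all, create, create, ...] in the order the API requires."""
    operations = []

    # remove_all must be the first operation in the job.
    remove_all_op = client.get_type("OfflineUserDataJobOperation")
    remove_all_op.remove_all = True
    operations.append(remove_all_op)

    # One create operation per new member, each carrying its UserData.
    for hashed_email in hashed_emails:
        identifier = client.get_type("UserIdentifier")
        identifier.hashed_email = hashed_email

        user_data = client.get_type("UserData")
        user_data.user_identifiers.append(identifier)

        create_op = client.get_type("OfflineUserDataJobOperation")
        create_op.create = user_data
        operations.append(create_op)

    return operations


def submit_replace(client, job_resource_name, hashed_emails):
    """Upload the operations to an existing job, then start the job."""
    service = client.get_service("OfflineUserDataJobService")

    request = client.get_type("AddOfflineUserDataJobOperationsRequest")
    request.resource_name = job_resource_name
    request.operations = build_replace_operations(client, hashed_emails)
    request.enable_partial_failure = True
    service.add_offline_user_data_job_operations(request=request)

    service.run_offline_user_data_job(resource_name=job_resource_name)
```

For large lists the operations would normally be uploaded in several requests; in that case only the first request should carry the remove_all operation.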

Daniel Vasilan

Feb 8, 2024, 10:06:28 AM
to Google Ads API and AdWords API Forum
Hi,


"To update your lists with the latest data, it is generally more efficient to append or remove individual users, rather than removing all users from the list and uploading them from scratch. I would suggest you refer to the Manage customer list document for more information."

Well, normally I would agree with this approach.
I've explained already where the problem is:
"Based on your documentation, we cannot mix create and remove operations in the same offline job.
Also, we cannot have concurrently running jobs, one for removal and one for adding new data on the same user list.
We cannot have them run in sequence, since there is no guarantee on when the other job is finishing and cannot wait for it, because of the extremely long duration of these processes."
So, how should we design an append and remove process?

"Also, note that remove_all operations are executed hourly, and could run for up to 24 hours."
I didn't find this in your docs (a link to where this info can be found would help).
Can you elaborate on why it runs for up to 24h? I can just about understand the duration of the match process, but is remove_all doing more than what the name suggests?


"Hope that helps. Let me know if you have any further questions."
Yes, I had one, and I'll repeat it:

"What is the impact of this on the processes that use the data from this list? Will they find an empty or partially loaded user list while the job is executing?
We want to assess the risks and implications before moving on with this solution."

Thanks!

Google Ads API Forum Advisor

Feb 8, 2024, 2:33:11 PM
to dd.va...@gmail.com, adwor...@googlegroups.com

Hi,

Thank you for getting back to us.

Kindly find answers below for your queries.

  1. So, how should we design an append and remove process? What is the impact of this on the processes that use the data from this list? Will they find an empty or partially loaded user list while the job is executing?
  • You would be required to use the removeAll operation to remove all your previously provided data and then add the latest data by utilizing the create operation.
  • When a remove_all is requested, two pipelines run: one that runs a PURGE operation, and another that handles the ADD/REMOVE operations. The first pipeline runs as a daily job and can take up to 72 hours to complete.
  • If the remove operation was called first, then the newly added operations should be uploaded after the remove operation has been completed.
  • Kindly note that you should not mix create and remove operations within the same OfflineUserDataJob. Doing so can result in a CONFLICTING_OPERATION error.
  • Enable partial_failure in an AddOfflineUserDataJobOperationsRequest so as to detect any problematic operations before running the job. Operations are validated when uploaded to an OfflineUserDataJob.

If you want to both create and remove users, you have to create two separate jobs and pass each set of operations in its own AddOfflineUserDataJobOperationsRequest.
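
Keeping creates and removes in two separate jobs, with partial failure enabled at upload time, might look like the sketch below. It is a sketch under assumptions rather than a definitive implementation: `client` is an already configured `GoogleAdsClient`, each job resource name refers to a job created beforehand for the same user list, the e-mails are already hashed, and `build_operations` / `upload_operations` are hypothetical helper names.

```python
def build_operations(client, hashed_emails, kind):
    """Build a homogeneous list of operations; kind is 'create' or 'remove'."""
    if kind not in ("create", "remove"):
        raise ValueError("kind must be 'create' or 'remove'")

    operations = []
    for hashed_email in hashed_emails:
        identifier = client.get_type("UserIdentifier")
        identifier.hashed_email = hashed_email

        user_data = client.get_type("UserData")
        user_data.user_identifiers.append(identifier)

        operation = client.get_type("OfflineUserDataJobOperation")
        if kind == "create":
            operation.create = user_data
        else:
            operation.remove = user_data
        operations.append(operation)
    return operations


def upload_operations(client, job_resource_name, operations):
    """Upload operations to one job; return (ok, message) from partial failure."""
    service = client.get_service("OfflineUserDataJobService")

    request = client.get_type("AddOfflineUserDataJobOperationsRequest")
    request.resource_name = job_resource_name
    request.operations = operations
    request.enable_partial_failure = True
    response = service.add_offline_user_data_job_operations(request=request)

    # A non-zero status code means some operations were rejected at upload.
    error = response.partial_failure_error
    return error.code == 0, error.message
```

Because every list built by `build_operations` contains only one kind of operation, a job fed from a single call cannot mix creates and removes.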

It takes 6 to 48 hours for a list to be populated with members, so you'll most likely see an "In Progress" status (on the Google Ads UI) if you upload to an audience list more frequently than once every 12 hours. 

Daniel Vasilan

Feb 9, 2024, 4:04:41 AM
to Google Ads API and AdWords API Forum
Sorry, which of these is true?

"Also, note that remove_all operations are executed hourly, and could run for up to 24 hours"
OR
"In the removeAll operation, there are two pipelines that run when a remove_all is requested: -- a pipeline that runs a PURGE operation, and another that handles the ADD/REMOVE operations. The first pipeline runs as a daily job and can take up to 72 hours to complete"

If the latter is correct, a 72-hour remove_all job cannot work: not only because of the extremely long duration, but also because we would have an empty list for 3-4 days (purge operation + ADD operations together). That would make the list unusable!

That means we have to create 2 jobs for each list: one to add new users, and another to remove the users who must leave the list.
For 10 lists, we'd have to create 20 jobs.
Since these jobs also run for up to 24h each, and we cannot put them in a queue to run automatically once the first one has finished, we'll have to run the ADD on one day and the REMOVE on the next day, and probably refresh these lists weekly.
The orchestration gets complicated anyway: even if we leave 24h between them, we see these jobs taking longer than officially documented.
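
A caller-side queue of the kind described here could be approximated by polling each job until it reaches a terminal status before starting the next one. The sketch below uses only the standard library; `wait_for_job` and `run_in_sequence` are hypothetical helpers, the `start` and `get_status` callables stand in for the actual API calls, and the poll interval and timeout defaults are arbitrary assumptions.

```python
import time

# Statuses after which a job will not change any more.
TERMINAL_STATUSES = {"SUCCESS", "FAILED"}


def wait_for_job(get_status, poll_seconds=900, timeout_seconds=48 * 3600,
                 sleep=time.sleep):
    """Poll get_status() until it is terminal; return 'TIMEOUT' if it never is."""
    waited = 0
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_seconds:
            return "TIMEOUT"
        sleep(poll_seconds)
        waited += poll_seconds


def run_in_sequence(steps, **wait_kwargs):
    """Run (name, start, get_status) steps one by one; stop at the first non-success."""
    results = []
    for name, start, get_status in steps:
        start()
        status = wait_for_job(get_status, **wait_kwargs)
        results.append((name, status))
        if status != "SUCCESS":
            break
    return results
```

With this shape, the ADD job and the REMOVE job for one list become two entries in `steps`, and the second only starts once the first has succeeded.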
 
If that's the only solution, I'll need to go to the business stakeholders, explain the limitations of this Google product to them, and see if we can use it for our use cases.

Please give me feedback on my statements above.

Thanks!
Daniel

Google Ads API Forum Advisor

Feb 9, 2024, 10:20:02 AM
to dd.va...@gmail.com, adwor...@googlegroups.com
Hi,

Thank you for getting back to us.

Kindly note that remove_all operations are executed hourly, and could run for up to 24 hours. I recommend checking the guide "Remove all data from the list at once" for more information.


To remove all users from a list, set remove_all to true in an OfflineUserDataJobOperation, then issue a RunOfflineUserDataJob request with the resource name associated with the remove_all operation.

Note that when a remove_all operation is included, it must be the first operation in a job. If not, then running the job returns an INVALID_OPERATION_ORDER error. To completely replace the members of a user list with new members, order the operations in AddOfflineUserDataJobOperationsRequest in this sequence:
    • Set remove_all to true in an OfflineUserDataJobOperation.
    • For each new member, add a create operation, setting their UserData in an OfflineUserDataJobOperation.

Remove operations cannot be mixed with create operations in a single job. Running such a job will fail with a CONFLICTING_OPERATION error.