Hello support team,
We currently maintain a system that queries GAM for business-critical data used by Oracle Moat’s and Google’s paying customers, and that system has reached end-of-life. We are porting the query workloads to a newer system on Airflow and have hit a blocker: some of these queries time out in the new implementation. Without a resolution, we will not be able to continue supporting our customers’ business-critical workflows, so we kindly ask for your help in troubleshooting. The error is rather opaque on our end, and we have little to go on beyond the recommended troubleshooting steps in the GAM API docs.
We followed the official guide https://developers.google.com/ad-manager/api/reporting on how to run and download a report with the API. We used the Python Google Ad Manager client library https://github.com/googleads/googleads-python-lib at version 32.0.0 with API version v202205. We started by creating an ad hoc report query https://developers.google.com/ad-manager/api/reporting#building_a_reportquery, then created the report job and waited for it to complete https://developers.google.com/ad-manager/api/reporting#creating_the_reportjob, and finally downloaded the report https://developers.google.com/ad-manager/api/reporting#downloading_the_report.
Most of the jobs complete successfully. However, a few jobs fail in the last step, downloading the report. Specifically, we get a timeout error when fetching the download URL of the report https://developers.google.com/ad-manager/api/reference/v202205/ReportService#getreportdownloadurlwithoptions. Here are the logs of the failure:
Traceback (most recent call last):
  File "/dataimport_airflow/dataimport_airflow/dags/adservers/dfp/dfp_client.py", line 119, in download_report
    report_downloader.DownloadReportToFile(report_job_id, export_format, report_file)
  File "/home/airflow/.local/lib/python3.7/site-packages/googleads/ad_manager.py", line 826, in DownloadReportToFile
    report_url = service.getReportDownloadUrlWithOptions(report_job_id, opts)
  File "/home/airflow/.local/lib/python3.7/site-packages/googleads/common.py", line 985, in MakeSoapRequest
    *packed_args, _soapheaders=soap_headers)['body']['rval']
  File "/home/airflow/.local/lib/python3.7/site-packages/zeep/proxy.py", line 51, in __call__
    kwargs,
  File "/home/airflow/.local/lib/python3.7/site-packages/zeep/wsdl/bindings/soap.py", line 127, in send
    response = client.transport.post_xml(options["address"], envelope, http_headers)
  File "/home/airflow/.local/lib/python3.7/site-packages/zeep/transports.py", line 108, in post_xml
    return self.post(address, message, headers)
  File "/home/airflow/.local/lib/python3.7/site-packages/zeep/transports.py", line 75, in post
    address, data=message, headers=headers, timeout=self.operation_timeout
  File "/home/airflow/.local/lib/python3.7/site-packages/requests/sessions.py", line 578, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/home/airflow/.local/lib/python3.7/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/airflow/.local/lib/python3.7/site-packages/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/home/airflow/.local/lib/python3.7/site-packages/requests/adapters.py", line 529, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='ads.google.com', port=443): Read timed out. (read timeout=3600)
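For context, the flow that leads to this traceback can be sketched roughly as follows. This is a simplified illustration and not our production code: the function names, the temporary-file handling, and the dimensions, columns, and date range in the example query are placeholders chosen for illustration. The `client` argument is assumed to be a `googleads.ad_manager.AdManagerClient`, and the two downloader calls are the ones from the client library that appear in the guide and in the traceback.

```python
import tempfile

API_VERSION = "v202205"  # API version we are using


def build_report_job():
    """Return an ad hoc report job (all values are illustrative placeholders)."""
    return {
        "reportQuery": {
            "dimensions": ["DATE", "AD_UNIT_NAME"],
            "columns": ["AD_SERVER_IMPRESSIONS"],
            "dateRangeType": "LAST_WEEK",
        }
    }


def run_and_download_report(client, report_job, export_format="CSV_DUMP"):
    """Run a report job, wait for it to finish, and download it to a temp file.

    `client` is assumed to be a googleads.ad_manager.AdManagerClient.
    Returns a (report_job_id, file_path) tuple.
    """
    downloader = client.GetDataDownloader(version=API_VERSION)

    # Creates the report job and polls until it completes.
    report_job_id = downloader.WaitForReport(report_job)

    # Fetches the download URL internally (getReportDownloadUrlWithOptions),
    # then streams the report into the file. In our case the timeout is
    # raised while the download URL is being fetched.
    with tempfile.NamedTemporaryFile(
        suffix=".csv.gz", mode="wb", delete=False
    ) as report_file:
        downloader.DownloadReportToFile(report_job_id, export_format, report_file)

    return report_job_id, report_file.name
```

With a client loaded via `ad_manager.AdManagerClient.LoadFromStorage()`, this would be called as `run_and_download_report(client, build_report_job())`.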
The confusing part is that when we tried to reproduce the error in our local development environment, the issue did not occur. We were able to get the download URL of the report with the generated report job ID and download it successfully. The response was returned in 10 to 30 minutes.
We hope that you can help us investigate the issue and provide any insight into how to resolve it in our Airflow application.
Please let us know if you need any additional information for troubleshooting. Thanks so much!
Best wishes,