Hi,
Recently we noticed some batch job issues. More precisely, we see three problems:
1) Strange progress stats
id = 535340862
status = "DONE"
progressStats =
(ProgressStats){
numOperationsExecuted = 0
numOperationsSucceeded = 0
estimatedPercentExecuted = 0
numResultsWritten = 7414
}
downloadUrl = ...
The linked result file indeed contains 7414 entries, which indicates that the operations were executed successfully.
However, numOperationsExecuted, numOperationsSucceeded and estimatedPercentExecuted do not reflect that.
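To make the mismatch concrete, here is a rough sketch of the sanity check we would expect a DONE job to pass. It assumes the progress stats are available as a plain dict with the field names shown above, and that every executed operation writes exactly one result; `stats_inconsistencies` is a hypothetical helper of ours, not part of any client library.

```python
def stats_inconsistencies(progress_stats):
    """Return a list of inconsistencies found in the stats of a DONE job.

    Assumption: one result is written per executed operation.
    """
    executed = progress_stats["numOperationsExecuted"]
    succeeded = progress_stats["numOperationsSucceeded"]
    percent = progress_stats["estimatedPercentExecuted"]
    written = progress_stats["numResultsWritten"]

    problems = []
    if succeeded > executed:
        problems.append(
            "numOperationsSucceeded (%d) > numOperationsExecuted (%d)"
            % (succeeded, executed))
    if written != executed:
        problems.append(
            "numResultsWritten (%d) != numOperationsExecuted (%d)"
            % (written, executed))
    if percent != 100:
        problems.append(
            "estimatedPercentExecuted is %d although the job is DONE" % percent)
    return problems


# Stats of job 535340862 as reported above.
job_stats = {
    "numOperationsExecuted": 0,
    "numOperationsSucceeded": 0,
    "estimatedPercentExecuted": 0,
    "numResultsWritten": 7414,
}
for problem in stats_inconsistencies(job_stats):
    print(problem)
```

For the job above this reports two inconsistencies: the result count does not match the executed count, and the percentage is not 100.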
2) Missing download URL
id = 535178174
status = "DONE"
progressStats =
(ProgressStats){
numOperationsExecuted = 25000
numOperationsSucceeded = 25000
estimatedPercentExecuted = 100
numResultsWritten = 25000
}
... no download url ...
Obviously this is problematic, as we cannot retrieve and process the results without a download URL.
3) Duplicates in result files
id = 535170169
status = "DONE"
progressStats =
(ProgressStats){
numOperationsExecuted = 25000
numOperationsSucceeded = 25000
estimatedPercentExecuted = 100
numResultsWritten = 45000
}
downloadUrl = ...
Looking into the linked result file, many results are indeed reported twice (for instance, setting the bid modifier on criterion 503006 in ad group 41233022792 to 0.75). The duplicates do not follow each other directly, so they are not easy to filter without maintaining an in-memory lookup of all results seen so far.
Also, in this example it is not the case that the first 25000 results are unique, in which case we could simply discard the rest. Duplicates already occur within the first 25000 result entries.
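For illustration, the workaround we would currently need looks roughly like the sketch below. It assumes that each result can be serialized to a string and that two identical strings mean the same result was reported twice (so operations that were legitimately submitted twice would be dropped as well); `unique_results` is a hypothetical helper of ours. Storing a fixed-size digest instead of the full result keeps the lookup smaller, but it still grows with the number of distinct results.

```python
import hashlib


def unique_results(results):
    """Yield each result string once, in order of first appearance.

    Keeps a 20-byte SHA-1 digest per distinct result in memory, so
    non-adjacent duplicates are caught without storing full results.
    """
    seen = set()
    for result in results:
        digest = hashlib.sha1(result.encode("utf-8")).digest()
        if digest in seen:
            continue  # already reported earlier in the file
        seen.add(digest)
        yield result


# Hypothetical serialized results; the real file format may differ.
sample = ["criterion 1 ok", "criterion 2 ok", "criterion 1 ok"]
deduplicated = list(unique_results(sample))
```

Because the function is a generator, the result file can be streamed through it instead of being loaded completely.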
The first issue is not that problematic, since we can treat the result file as the authority. However, issues 2 and 3 are real problems for us.
Thanks in advance for any support on these issues!
Best,
Christian