Abnormally high number of cancelled/failed batch jobs

GregT

Feb 27, 2017, 5:55:55 PM
to AdWords API Forum
Hi,

Over the past few days, we've had an abnormally high number of failed or cancelled jobs. On most days we might have one or two, but we've had hundreds recently, and it appears to be getting worse. We don't implement any kind of batch job cancelling, so we have not cancelled any of the jobs that are coming back as cancelled. Some end up with the status cancelled and no errors at all; others end up with that status along with the processing error BatchJobProcessingError.INTERNAL_ERROR.

It doesn't appear to be only for large jobs -- some of these jobs just have a couple hundred items in them.

Here are a few of the job IDs from the last few days:

449947201
449933684
449948656
449962867
450157971
449942291
450165792
450151143
450440619
450250571
450474459
450474480
450250934
450258277
450250217
450474525
450443074
450447055
450424781
450644013
450424784
450425831
450450919
450450943
450630987
450650571
450655104
450466378
450465178
450687648
450687759
450449945
450692790
450469244
450496363
450735687
450535240
450537328
450538384
450557909
450557915
450557603
450558140
450790359
450789465
450584875
450585478
450790491
450790500
450585643
450585658
450585673
450585688
450585694
450789978
450584944
450789984
450585520
450789987
450583243
450584953
450792171
450794268
450589843
450589876
450589219
450565799
450589969
450589975
450565832
450589996
450792576
450567439
450590023
450590035
450589711
450590050
450589714
450565844
450590074
450590086
450589735
450590107
450589741
450590119
450565874
450589759
450792606
450590299
450590329
450590341
450798222
450798798
450594850
450798810
450798813
450798819
450594871
450571265
450800061
450605098
450612475
450612574
450613123
450816012

Thanks,
Greg

Peter Oliquino

Feb 27, 2017, 11:07:26 PM
to AdWords API Forum
Hello Greg,

To help us investigate further, could you please provide the SOAP request and response logs for one of your failed BatchJobService jobs, if you were able to capture them? Please use "Reply privately to author" when sending the requested information. Additionally, you can inspect the downloadUrl and the list of processingErrors in the BatchJob for possible reasons for the failure.

Best regards,
Peter
AdWords API Team

GregT

Feb 28, 2017, 11:00:44 AM
to AdWords API Forum
Hi, Peter.

We don't actually log the SOAP for these because the SOAP calls themselves aren't failing -- it's just that the status of the jobs is eventually returned as cancelled. As mentioned in the original post, some have no processing or other errors at all - just the status cancelled. Others have that status and a single processing error, BatchJobProcessingError.INTERNAL_ERROR, with no other information in the processing errors. An example of the printout of all the fields we get for that error is:

errorType:BatchJobProcessingError, trigger:, errorString:BatchJobProcessingError.INTERNAL_ERROR, fieldPath:, reason:INTERNAL_ERROR

A couple example jobs from yesterday that return the error (along with being cancelled):
450605098
450816012

A couple that have no error, but are just cancelled:
450865617
451028751

All the ones listed in the original post fall into one of these two categories, I believe.

Thanks,
Greg
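
For readers triaging their own jobs, the two categories Greg describes can be separated mechanically. The following is a minimal sketch, assuming each processing error has been printed in the comma-separated `key:value` form shown in the post above. The function names are hypothetical and not part of any AdWords client library, and the parser assumes no field value contains the `", "` separator.

```python
def parse_error_printout(line):
    """Parse a 'key:value, key:value' error printout into a dict.

    Assumes the format quoted in the post; values containing ', '
    would be split incorrectly.
    """
    fields = {}
    for part in line.split(", "):
        # Split on the first colon only; empty values become "".
        key, _, value = part.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def classify_cancelled_job(error_lines):
    """Classify an already-cancelled job by its printed processing errors."""
    reasons = [parse_error_printout(line).get("reason") for line in error_lines]
    if not reasons:
        return "no_errors"
    if "INTERNAL_ERROR" in reasons:
        return "internal_error"
    return "other_errors"
```

Passing an empty list models a job that came back cancelled with no processing errors at all.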

Nadine Sundquist (AdWords API Team)

Mar 1, 2017, 10:44:39 AM
to AdWords API Forum
Hi Greg,

Thanks for all the helpful details! I started investigating on our servers to track down why this is happening. On my initial pass of our logs, I don't see a reason for the cancel happening when it appears that the job is finishing (whether it has errors or not). I'm pulling in a few more people here to take a look to help me solve this mystery. I'll get back to you when we've made a bit more progress on this.

Best,
Nadine, AdWords API Team

Nadine Sundquist (AdWords API Team)

Mar 1, 2017, 2:20:33 PM
to AdWords API Forum
Hi Greg,

One of the engineers on my team found the root cause. We seem to be having a few difficulties with operations that have product partitions. We're currently putting some code in place to retry when those particular operations have issues.

Cheers,
Nadine, AdWords API Team

GregT

Mar 1, 2017, 2:31:09 PM
to AdWords API Forum
Great - thanks! 

jor...@jcrocker.uk

Jun 27, 2017, 5:25:31 AM
to AdWords API Forum
Hi Nadine,

I am currently facing the same situation as Greg in that all jobs I send are automatically going to a Cancelled status. These jobs are for product partitions.

A few example job IDs:

520125237
520125243
519862123
520125246
519445444
520125249
520125240

These are all for account 682-998-6792.

Could this be related to the same cause as Greg's issue?

Kind regards,
Jordan

Nadine Sundquist (AdWords API Team)

Jun 27, 2017, 5:18:31 PM
to AdWords API Forum
Hello Jordan,

I took a look on our servers, and at first glance it does look like it could be the same issue. I've sent on the job IDs to my teammate who's assigned to this. I'll get back to you when I hear back. This looks like a tough one. Thank you for providing these job IDs.

Take care,
Nadine, AdWords API Team

si...@hotsnapper.com

Jul 5, 2017, 5:32:44 AM
to AdWords API Forum
Hello Nadine, 

I am working alongside Jordan. Do we have an update?

Regards

Simon

Nadine Sundquist (AdWords API Team)

Jul 5, 2017, 8:23:27 AM
to AdWords API Forum
Hello Simon,

It looks like we have found the root cause. Now that the holidays are winding down here in the US, we can look at trying to find a solution. I'll keep you in the loop.

Best,
Nadine, AdWords API Team

Nadine Sundquist (AdWords API Team)

Jul 6, 2017, 9:45:53 AM
to AdWords API Forum
Hello Simon,

The fix is now on our production servers. Please give it another try. If you're still experiencing issues, please get back to me.

Thanks,
Nadine, AdWords API Team

si...@hotsnapper.com

Jul 7, 2017, 9:44:18 AM
to AdWords API Forum
Hello Nadine,

Thank you for fixing this error for us. I will test it out, and if we run into any issues, Jordan or I will get back in contact.

Regards

Simon

jor...@jcrocker.uk

Aug 15, 2017, 4:57:05 AM
to AdWords API Forum
Hello Nadine,

We are experiencing this problem again, with the following sample job IDs:

546679208
546680861
546680843
546739639
546680837
546740809
546740803
546738088
546740836
546679832
546680852
546739642
546738997
547027008
546680846
546680882
546738103
546740464
546740800
546679220
546740467
546740839
547027005
546679214
546740797
546740812
546680849
546679205
546679217
546679835

Thanks for your help.

Kind regards,
Jordan

GregT

Aug 15, 2017, 12:15:08 PM
to AdWords API Forum
We are seeing this again as well. Here are a few sample job IDs from us, in case it helps:

546629438
546628553
546690922
546964719
546610328
546608684
546962208
546611558
546606977
546961335
546605693
546959982
546959979
546958008
546956727
546676590
546676494
546676485
546676170
546354463
546353035
546338374
546338884
546338836

Thanks,
Greg

Nadine Sundquist (AdWords API Team)

Aug 16, 2017, 3:16:56 AM
to AdWords API Forum
Greetings All,

I've filed an issue on this so we can dive deeper into why this is happening again. I'll get back to you when we've made more progress.

Best,
Nadine, AdWords API Team

jor...@jcrocker.uk

Sep 19, 2017, 6:11:24 AM
to AdWords API Forum
Hi Nadine,

Sorry to reopen this thread, but we are experiencing the same problem again.

Example job IDs:

567961108
568327521

I believe the number of partition operations is under the prescribed limit, but the jobs are going straight to cancelled and not providing any error messages.

Could you please advise urgently?

Kind regards,
Jordan

Nadine Sundquist (AdWords API Team)

Sep 19, 2017, 4:14:29 PM
to AdWords API Forum
Hello Jordan,

This issue looks a bit different. Usually no results back means there was an unrecoverable error. At this point, I'm not sure what that is. I'll hunt that down for you to figure out what's going on. I'll get back to you when I have the reason.

Regards,
Nadine, AdWords API Team

Nadine Sundquist (AdWords API Team)

Sep 20, 2017, 10:10:49 AM
to AdWords API Forum
Hi Jordan,

So, I now have a reason why the most recent jobs are failing. In short, you've found a limitation of the batch job service and pushed it to its limits. Product partition operations for the same ad group must be submitted together and can't be broken up. The problem is that on the back end, while we are processing them, we're getting back a REQUEST_SIZE_LIMIT_EXCEEDED error, meaning they can't be submitted as a block. Usually, when the batch job service gets this kind of error, it silently retries with fewer operations, but that is not possible for product partitions, so the job fails. Thanks for bringing this to light; we're exploring a few ways to solve this. In the meantime, please try submitting fewer product partition operations at a time. I'll update you when I have more, but this looks like a tough one, so it's not one of those issues that can be solved overnight.

Best,
Nadine, AdWords API Team
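
One way to follow the "submit fewer product partition operations at a time" advice without splitting an ad group's operations is to pack whole ad groups into jobs under a cap. The following is a rough sketch only: `batch_by_ad_group`, `max_ops_per_job`, and `ad_group_id_of` are hypothetical names, the thread does not state the actual size limit, and operation count is used here as a stand-in for request size.

```python
from collections import defaultdict


def batch_by_ad_group(operations, max_ops_per_job, ad_group_id_of):
    """Pack operations into jobs, never splitting one ad group across jobs.

    max_ops_per_job is an assumed, caller-chosen cap; the real limit
    is not given in the thread.
    """
    # Collect each ad group's operations together, preserving input order.
    by_ad_group = defaultdict(list)
    for op in operations:
        by_ad_group[ad_group_id_of(op)].append(op)

    jobs, current = [], []
    for ops in by_ad_group.values():
        # Start a new job if adding this ad group would exceed the cap.
        if current and len(current) + len(ops) > max_ops_per_job:
            jobs.append(current)
            current = []
        # An ad group larger than the cap still goes into a single job,
        # since its operations cannot be broken up.
        current.extend(ops)
    if current:
        jobs.append(current)
    return jobs
```

For example, with operations represented as `(ad_group_id, payload)` tuples, `batch_by_ad_group(ops, 3, lambda op: op[0])` returns a list of jobs in which every ad group's operations stay together. An ad group that alone exceeds the cap would still need its own partition tree reduced, which this sketch does not attempt.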