Transactional tasks delayed

Okku Touronen

Aug 10, 2020, 1:30:24 PM
to Google App Engine
Hello, around August 7 we started to notice warning messages from the system saying that some tasks had not run correctly. After some investigation, we strongly suspect that transactional tasks (tasks enqueued inside a transaction with the flag transactional=True) are malfunctioning. We noticed that they didn't run immediately, but were sometimes delayed by more than 30 seconds.

We had a similar issue back in January, where transactional tasks didn't run at all, which Google acknowledged and fixed.

Anyone else see this? 

(We run on Python 2.7)

Elliott (Cloud Platform Support)

Aug 12, 2020, 3:03:40 PM
to Google App Engine
Hello Okku,

I can look into this for you if you provide a timestamp with date, time, and time zone.

Okku Touronen

Aug 13, 2020, 10:48:27 AM
to google-a...@googlegroups.com
Thanks for the reply. We have a support case for this now. If you have access, its number is 24605582.

We have been analyzing this further, and we think there is an issue with the task scheduler. It schedules only 1 task/s when the queue is empty and never reaches a rate higher than 5 tasks/s. Our queues are set to a max rate of 500/s with a bucket size of 100. So my first assumption, that this is related to transactional tasks, might be wrong. Either way, there is still a major issue.
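For comparison, here is a rough sketch of what those queue settings would permit, assuming the scheduler behaves like an idealized token bucket that starts full and refills continuously. The function name and the simulation itself are illustrative only, not App Engine code.

```python
def dispatch_times(num_tasks, rate_per_sec, bucket_size):
    """Earliest dispatch time (in seconds) of each task under an
    idealized token bucket. Assumption: the bucket starts full."""
    times = []
    tokens = float(bucket_size)
    now = 0.0
    for _ in range(num_tasks):
        if tokens < 1.0:
            # Wait just long enough for one token to accumulate.
            now += (1.0 - tokens) / rate_per_sec
            tokens = 1.0
        tokens -= 1.0
        times.append(now)
    return times


# Three tasks, max rate 500/s, bucket size 100 (the settings quoted above).
print(dispatch_times(3, 500, 100))
```

Under these assumptions all three tasks are eligible to run at time 0, and even 600 tasks would be dispatched within about one second, which is far from the observed 1 to 5 tasks/s.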

We did a simple test:

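    import logging
    import webapp2

    def test_task(self):
        # Enqueue `count` no-op tasks (default 3) through our transactional helper.
        count = int(self.request.get("count", 3))
        for i in range(count):
            task.add_task_transactional(url=webapp2.uri_for('nop_task', id=i))

    def nop_task(self, id):
        # No-op task: only logs its ID.
        logging.info("ID = %s", id)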

Running this, we get the following log. All tasks should run more or less in parallel; instead, the last task runs more than 30 seconds after the first:
 


Alexis (Google Cloud Platform Support)

Aug 14, 2020, 1:59:02 PM
to Google App Engine
Hello Okku,

I looked into your case, and even though I cannot divulge private information in a public post, there is this[1] article, which states that at most five transactional tasks can be added to task queues during a single transaction. It also says that the tasks are enqueued rather than run in parallel. Obviously, if they run fast enough, they may appear to be parallel, but in this case they are slow from your perspective. I think what you are really asking is why they are throttled to one per second when each takes only milliseconds to complete (as shown in the picture above), rather than what the ceiling should be.
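One way to stay within that five-task limit is to split the work into groups of at most five and enqueue each group in its own transaction. The sketch below shows only the grouping. `enqueue_batch` is a hypothetical callable supplied by the caller, and the names here are illustrative rather than part of any App Engine API.

```python
MAX_TRANSACTIONAL_TASKS = 5  # per-transaction limit cited above


def batches(items, size=MAX_TRANSACTIONAL_TASKS):
    """Split items into consecutive lists of at most `size` elements."""
    items = list(items)
    return [items[i:i + size] for i in range(0, len(items), size)]


def enqueue_in_batches(task_ids, enqueue_batch):
    """Call enqueue_batch once per group of at most five task IDs.

    enqueue_batch is hypothetical: in a real app it would open one
    transaction and add that group's tasks inside it.
    Returns the number of groups (that is, transactions) used.
    """
    groups = batches(task_ids)
    for group in groups:
        enqueue_batch(group)
    return len(groups)
```

With twelve task IDs this produces three groups (five, five, and two), so no single transaction would carry more than five tasks.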

For example, in the picture, the 33rd second of 23:26 has 5 tasks, and you should aim to reach that consistently if you want it to be faster (although it will never be fully parallel, since the tasks are enqueued). If you add them all in the same run_in_transaction() function, that could be the cause. It's hard to say what the best pattern is, given the semantics of your application, but I think it might be best to look at it in reverse: try to figure out why certain tasks can run 5 times a second, and then reproduce that as a pattern in your application.
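To find the seconds in which the rate does reach 5, the executions can be counted per wall-clock second. This is a minimal sketch that assumes the timestamps have been copied from the log as 'HH:MM:SS.mmm' strings; both that format and the sample values below are made up for illustration.

```python
from collections import Counter


def tasks_per_second(timestamps):
    """Count task executions per wall-clock second.

    timestamps: iterable of 'HH:MM:SS.mmm' strings (assumed log format).
    """
    return Counter(ts.split(".")[0] for ts in timestamps)


# Illustrative values only, not taken from the real log.
sample = ["23:26:32.910", "23:26:33.015", "23:26:33.480", "23:26:34.002"]
for second, count in sorted(tasks_per_second(sample).items()):
    print("%s  %d task(s)" % (second, count))
```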

The bug that was confirmed previously was more of an outage, and engineers can relate those delays to outages. However, if you are experiencing this three times, months apart, a secondary problem could be the root cause.

I understand that your initial question was whether others are having this issue, and I will let others answer, but I did some research on that too and didn't find it to be common. If you need further specific assistance, it might be best to open a support ticket. Otherwise, try to isolate why certain tasks are faster than others in relation to your business logic.
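One possible way to isolate slow tasks is to log, inside each task handler, how late the task started relative to its target execution time. The sketch below assumes the push-task request carries the X-AppEngine-TaskETA header (seconds since the epoch). The function name is illustrative, and the header's exact semantics should be checked against the task queue documentation.

```python
import time


def queue_delay_seconds(headers, now=None):
    """Seconds between a task's target execution time and `now`.

    headers: dict-like request headers. Assumes X-AppEngine-TaskETA
    holds the target time as seconds since the epoch.
    Returns None when the header is missing (not a task request).
    """
    eta = headers.get("X-AppEngine-TaskETA")
    if eta is None:
        return None
    if now is None:
        now = time.time()
    return now - float(eta)
```

In a webapp2 handler this could be called as queue_delay_seconds(self.request.headers) and logged next to the task ID, so delayed tasks can be compared with fast ones.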
