Best practices for making GAE microservices idempotent

Nilson Pontello

Oct 6, 2017, 6:15:51 PM

to Google App Engine

Hi everyone,

I have an event-driven architecture built on top of Google Cloud Pub/Sub. Because Pub/Sub guarantees at-least-once delivery, my subscribers must be idempotent when processing messages.

The only way I've found to make them idempotent is by using database transactions, and I am hoping to find a better solution.

So, what else does GCP offer for making my App Engine services idempotent?

Many thanks

George (Cloud Platform Support)

Oct 6, 2017, 7:39:14 PM

to Google App Engine

Hello Nilson,

Among other sources, you may benefit from reading the "Implementing Workflows on Google App Engine with Fantasm" article. Fantasm is an excellent launch pad for understanding how to decompose a workflow into appropriately sized chunks and for gaining a solid footing in how to build idempotent states.

Jason Collins

Oct 7, 2017, 10:45:25 PM

to Google App Engine

Fantasm! That's a blast from the past!

I'm one of the original authors of that package. It does indeed help to make things idempotent, but at the end of the day, it relies on a datastore transaction (fronted by memcache for some performance gains): https://github.com/iki/fantasm/blob/master/src/fantasm/lock.py#L150

Datastore transactions are one of the few things you can "count on" in a distributed system like App Engine.
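A minimal sketch of the run-once check described above, assuming the Pub/Sub message ID is used as the deduplication key. An in-memory SQLite database stands in for Datastore here so the example runs anywhere; the table names and the `handle_once` helper are hypothetical and are not taken from Fantasm's lock.py.

```python
import sqlite3


def make_store():
    # In-memory SQLite is only a stand-in for Datastore in this sketch.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed (message_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE counters (name TEXT PRIMARY KEY, value INTEGER)")
    conn.execute("INSERT INTO counters VALUES ('orders', 0)")
    conn.commit()
    return conn


def handle_once(conn, message_id):
    """Apply the side effect at most once per message_id.

    Returns True if the work was done, False for a duplicate delivery.
    """
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            # The primary key makes a second insert of the same ID fail,
            # so the marker and the side effect succeed or fail together.
            conn.execute("INSERT INTO processed VALUES (?)", (message_id,))
            conn.execute(
                "UPDATE counters SET value = value + 1 WHERE name = 'orders'"
            )
        return True
    except sqlite3.IntegrityError:
        return False


def counter_value(conn):
    row = conn.execute(
        "SELECT value FROM counters WHERE name = 'orders'"
    ).fetchone()
    return row[0]
```

Delivering the same message twice then changes the counter only once: the second call to `handle_once` returns False and leaves the data untouched.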

Another tool that is often helpful is named tasks, which can help ensure that you don't requeue an already queued task (e.g., in the event of a retry): https://cloud.google.com/appengine/docs/standard/python/taskqueue/push/creating-tasks#naming_a_task
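As a rough illustration of the named-task idea, the sketch below derives a deterministic task name from a message ID so that a retry maps to the same name and is rejected. `InMemoryQueue`, `TaskAlreadyExists`, `task_name_for`, and `enqueue_once` are hypothetical stand-ins so the example is self-contained. On App Engine the equivalent would presumably be `taskqueue.add(name=...)`, catching `TaskAlreadyExistsError` and `TombstonedTaskError`.

```python
import hashlib


class TaskAlreadyExists(Exception):
    """Hypothetical stand-in for the error raised when a task name is reused."""


class InMemoryQueue(object):
    """Toy queue that rejects a task whose name it has already seen."""

    def __init__(self):
        self._names = set()
        self.tasks = []

    def add(self, name, payload):
        if name in self._names:
            raise TaskAlreadyExists(name)
        self._names.add(name)
        self.tasks.append((name, payload))


def task_name_for(message_id):
    # A hex digest contains only [0-9a-f], so the result uses only
    # letters, digits, and a hyphen.
    digest = hashlib.sha256(message_id.encode("utf-8")).hexdigest()
    return "msg-" + digest


def enqueue_once(queue, message_id, payload):
    """Return True if the task was queued, False if it was a duplicate."""
    try:
        queue.add(task_name_for(message_id), payload)
        return True
    except TaskAlreadyExists:
        return False
```

The important design choice is that the name is a pure function of the message ID, so every redelivery of the same message produces the same task name.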

Finally, keep in mind that you can also enqueue a task transactionally with a Datastore transaction: https://cloud.google.com/appengine/docs/standard/python/ndb/transactions#python_Transactional_task_enqueuing
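To show what transactional enqueuing buys you, here is a small sketch in which a data write and a task enqueue either both commit or both roll back. SQLite again stands in for Datastore and the task queue, and the table names, the `/tasks/ship` URL, and the `fail` flag are made up for illustration. On App Engine the corresponding call would be along the lines of `taskqueue.add(..., transactional=True)` inside an `@ndb.transactional` function.

```python
import sqlite3


def make_db():
    # One SQLite database plays both roles: entity storage and task queue.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
    conn.execute(
        "CREATE TABLE task_queue "
        "(id INTEGER PRIMARY KEY, url TEXT, order_id TEXT)"
    )
    conn.commit()
    return conn


def create_order_and_enqueue(conn, order_id, fail=False):
    """Write the order and enqueue its follow-up task atomically."""
    try:
        with conn:  # single transaction for both writes
            conn.execute("INSERT INTO orders VALUES (?, 'new')", (order_id,))
            conn.execute(
                "INSERT INTO task_queue (url, order_id) "
                "VALUES ('/tasks/ship', ?)",
                (order_id,),
            )
            if fail:
                # Simulates a crash after the writes but before the commit.
                raise RuntimeError("simulated failure before commit")
        return True
    except RuntimeError:
        return False


def count_rows(conn, table):
    # Only the two known table names are allowed in this sketch.
    assert table in ("orders", "task_queue")
    return conn.execute("SELECT COUNT(*) FROM " + table).fetchone()[0]
```

A failed call leaves neither an order nor a task behind, so a retry starts from a clean state instead of finding an order with no task, or a task with no order.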

With some creativity, you can get a long way with these basic building blocks.

Nilson Pontello

Oct 9, 2017, 7:48:00 AM

to Google App Engine

Thanks Jason/George,

My code does a job very similar to lock.py (this is what I am trying to avoid).

So it looks like the secret is around task queues. Can I assume that enqueued tasks will be delivered just once (if my endpoint returns 200 OK)?

If yes, then my problem is solved. But if it behaves like Pub/Sub, which guarantees at-least-once delivery, then it will be useless for my case.

BTW: If those named tasks hit Datastore for deduplication, then lock.py's approach is still valid and its performance will be as good as that of named tasks.

Thanks

Jason Collins

Oct 9, 2017, 9:28:04 AM

to Google App Engine

No. You can't make that assumption about task queue delivery, though it's very, very good. This is why Fantasm had to go the extra distance with Datastore transactions.

BTW, Fantasm allows you to turn off the Datastore run-once check and rely solely on the task queue because, in practice, the task queue does a great job. Just not a perfect one. So you have to choose between the small performance hit of a Datastore transaction and a very small chance of duplicate task delivery.

BTW I can't recall the named task deduplication ever failing when enqueuing a task.


Nilson Pontello

Oct 9, 2017, 10:32:13 AM

to Google App Engine

Thanks for the clarification. It looks like I will have to keep hitting Datastore for my use case.

BTW: Can Fantasm benefit from making lock transactions using Cloud Spanner instead of Datastore? Or is Datastore faster for small transactions like the ones performed by lock.py?

Thanks

Jason Collins

Oct 9, 2017, 11:32:28 AM

to Google App Engine

Sorry, I don't have operational experience with Cloud Spanner. It didn't exist when Fantasm was written.

Attila-Mihaly Balazs

Oct 10, 2017, 1:11:43 AM

to Google App Engine

Just a quick note: while GAE supports named tasks and transactional enqueuing of tasks, it does not support the transactional enqueuing of named tasks.

Just a small detail to be aware of when architecting your application.

Attila
