celery and race conditions

Jonathan Vanasco

Apr 14, 2014, 5:45:25 PM

to sqlal...@googlegroups.com

I just ran into an issue where it looks like I could have race conditions using SQLAlchemy and Celery. I'm wondering if anyone here has some ideas.

Here's the scenario:

A1 Process A - Pyramid - creates SQLAlchemy session

A2 Process A - Pyramid - Creates data

A3 Process A - Pyramid - flushes data

A4 Process A - Pyramid - fires off an async request to Celery

A5 Process A - Pyramid - more operations

A6 Process A - Pyramid - commits.

B1 Process B - Celery - gets async request

B2 Process B - Celery - creates SQLAlchemy session

B3 Process B - Celery - starts pulling data from the database

B4 Process B - Celery - starts writing data to the database

B5 Process B - Celery - Commit

The problem I foresee is that the B series of events could happen between A4 and A6, before the data from A2/A3 has been committed. By luck, it appears they aren't happening until after A6, but nothing in my code ensures that ordering.

Has anyone else dealt with this?
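To make the window between A4 and A6 concrete, here is a minimal sketch of what a second connection sees while the first transaction is still open. It uses the standard library's `sqlite3` as a stand-in for the real database and for both SQLAlchemy sessions; the table name `items` and the helper `visible_rows` are made up for illustration, and other databases or isolation levels may behave differently.

```python
import os
import sqlite3
import tempfile


def visible_rows(db_path):
    """Open a fresh connection (as a worker process would) and count visible rows."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT COUNT(*) FROM items").fetchall()[0][0]
    finally:
        conn.close()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")

        # "Process A": the web request's connection.
        web = sqlite3.connect(db_path)
        web.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
        web.commit()

        # A2/A3: data is written inside an open transaction, not yet committed.
        web.execute("INSERT INTO items (name) VALUES ('photo')")

        # B series running between A4 and A6: the new row is not visible yet.
        before = visible_rows(db_path)

        # A6: commit.
        web.commit()

        # B series running after A6: the row is visible.
        after = visible_rows(db_path)

        web.close()
        return before, after
```

If the worker runs in the "before" window, it looks up a row that does not exist from its point of view, which is the failure mode described above.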

Michael Bayer

Apr 14, 2014, 6:09:25 PM

to sqlal...@googlegroups.com

Sure, you need to start your celery process after the commits.

I usually gather up “post-commit” tasks in some kind of list and then iterate through them after the commit to run them.
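A rough sketch of that "gather, then run after commit" idea is below. It is plain Python with no SQLAlchemy or Celery imports; `PostCommitTasks` and `commit_then_dispatch` are hypothetical names invented for this illustration, not part of either library.

```python
class PostCommitTasks:
    """Collects callables during a transaction so they can run after the commit."""

    def __init__(self):
        self._pending = []

    def defer(self, func, *args, **kwargs):
        # Remember the call instead of making it now.
        self._pending.append((func, args, kwargs))

    def discard(self):
        # Drop everything, e.g. because the transaction was rolled back.
        self._pending.clear()

    def run(self):
        # Swap the list out first so a task that defers more work
        # does not modify the list being iterated.
        pending, self._pending = self._pending, []
        for func, args, kwargs in pending:
            func(*args, **kwargs)


def commit_then_dispatch(commit, rollback, tasks):
    """Commit first; fire the deferred tasks only if the commit succeeded."""
    try:
        commit()
    except Exception:
        rollback()
        tasks.discard()
        raise
    tasks.run()
```

In an application this would presumably be called with something like `commit_then_dispatch(session.commit, session.rollback, tasks)`, with each deferred callable being a Celery task's `delay` method. Note that a task which fails to dispatch in `run()` leaves the data committed but the work unqueued, so that case still needs its own handling.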


Jonathan Vanasco

Apr 14, 2014, 7:12:12 PM

to sqlal...@googlegroups.com

I've got that now as a stopgap; I was hoping someone has better ideas. I don't like the idea of a post-commit hook, because I fear that sending the Celery task request could itself raise an error. I really don't want to build `transaction` support for Celery, but I might need to.

Wichert Akkerman

Apr 15, 2014, 4:47:24 AM

to sqlal...@googlegroups.com

On 15 Apr 2014, at 01:12, Jonathan Vanasco <jona...@findmeon.com> wrote:

I've got that now as a stopgap; I was hoping someone has better ideas. I don't like the idea of a post-commit hook, because I fear that sending the Celery task request could itself raise an error. I really don't want to build `transaction` support for Celery, but I might need to.

That isn’t an uncommon scenario; I touched upon that in http://www.wiggy.net/articles/task-queues as well. For rq I am using a variant of https://gist.github.com/wichert/10714681 . One extra problem you need to take into account is that you are likely to run into problems when one of the arguments is a SQLAlchemy ORM instance: when your function is later run in another process that instance won’t be associated with the current session, so you need to merge it.

Wichert.

Jonathan Vanasco

Apr 15, 2014, 12:29:25 PM

to sqlal...@googlegroups.com

If I have any time after shipping, I'll probably build in transaction support for Celery and Pyramid.

I keep away from tossing ORM objects around the system. GETS are pretty cheap.

my task arguments are generally:

int = primary key of ORM object

dict = "instructions" payload of what to do

for image processing, it's generally

optimize_images(orm_id, {"file_b64": BLOB, "selected_resizes": []})

Celery grabs and uses its own session, pulling data off the database. It optimizes the image, archives it to S3, and does the same for the resizes, then updates the database to reflect all the file sizes and locations.
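As a rough illustration of that calling convention (primary key plus a plain dict, with the task opening its own connection), here is a simplified sketch. It uses `sqlite3` in place of a SQLAlchemy session and a plain function in place of a Celery task; the `images` table, the `filesize` column, and the extra `db_path` argument are assumptions made for the example, and the actual optimizing and S3 upload are left out.

```python
import base64
import sqlite3


def optimize_images(db_path, orm_id, instructions):
    """Hypothetical task body: takes only a primary key and a plain dict."""
    payload = base64.b64decode(instructions["file_b64"])

    # The task opens its own connection instead of receiving an ORM object.
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT id FROM images WHERE id = ?", (orm_id,)
        ).fetchone()
        if row is None:
            # Row not visible: e.g. the task was dispatched before the commit.
            return False

        # Stand-in for "optimize, archive, then record sizes and locations".
        conn.execute(
            "UPDATE images SET filesize = ? WHERE id = ?",
            (len(payload), orm_id),
        )
        conn.commit()
        return True
    finally:
        conn.close()
```

Because only an integer and a dict cross the process boundary, nothing has to be merged into the worker's session; the cost is one extra lookup by primary key.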
