Multiple sidekiq processes running on a single machine... seems like both try and work the same job.

Altonymous

May 6, 2015, 5:37:35 PM
to sid...@googlegroups.com
I have started multiple Sidekiq processes on one of my machines. I'm not positive, and I certainly don't want to scream that the sky is falling without evidence, but it looks like more than one of them may be working the same job.

Is there anything I can do to determine whether, in some cases, different processes or workers are picking up the same job? Each of my processes runs 10 threads. The reason I suspect this is that the worker is responsible for downloading a file from S3 and importing it into a PostgreSQL database. In some cases I get errors in my Sidekiq jobs stating that the file no longer exists:

"AWS::S3::Errors::NoSuchKey: No Such Key"

To me, this means that a queueing worker found work and enqueued it for my parsing workers, and then more than one worker picked up the job, with the second arriving a little too late to the party. The first thing a worker does when it gets a job to parse a file is move the file to a processing/ folder on S3, to help ensure nobody else tries to parse it too.


If what I'm saying makes sense and it's possible, is there a way I can prevent this?

Abhi Rao

May 6, 2015, 6:49:39 PM
to sid...@googlegroups.com

Do you have retries configured? Based on your description, if the worker hits a failure after moving the file, Sidekiq could retry the job later, and that retry would be the second attempt.
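To make the retry scenario concrete, here is a minimal plain-Ruby sketch. It does not use Sidekiq or the AWS SDK: `FakeBucket`, `NoSuchKey`, `parse_job`, and `retry_safe_parse_job` are hypothetical stand-ins, and the import failure is injected by hand. The first attempt moves the file and then fails; a retry of `parse_job` finds nothing at the original key, while the retry-safe variant resumes from the processing/ copy.

```ruby
# Hypothetical stand-in for the S3 error seen in the thread.
class NoSuchKey < StandardError; end

# Hypothetical in-memory stand-in for an S3 bucket (not the AWS SDK).
class FakeBucket
  def initialize(keys)
    @objects = {}
    keys.each { |key| @objects[key] = "data" }
  end

  def exists?(key)
    @objects.key?(key)
  end

  # Mimics a move: raises if the source key is already gone.
  def move(from, to)
    raise NoSuchKey, "No Such Key: #{from}" unless @objects.key?(from)
    @objects[to] = @objects.delete(from)
  end
end

# The job roughly as described: move the file first, then import it.
def parse_job(bucket, key, fail_import: false)
  bucket.move(key, "processing/#{key}")
  raise "import failed" if fail_import # simulated database error
  :imported
end

# Retry-safe variant: if an earlier attempt already moved the file,
# continue from processing/ instead of moving it again.
def retry_safe_parse_job(bucket, key, fail_import: false)
  claimed = "processing/#{key}"
  bucket.move(key, claimed) unless bucket.exists?(claimed)
  raise "import failed" if fail_import
  :imported
end
```

The retry-safe variant only covers sequential retries of the same job. `exists?` followed by `move` is not atomic, so it does not protect against two workers running at the same moment. In Sidekiq itself, `sidekiq_options retry: false` on the worker class turns automatic retries off, which is one way to test whether retries are the cause.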


--
You received this message because you are subscribed to the Google Groups "Sidekiq" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sidekiq+u...@googlegroups.com.
To post to this group, send email to sid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sidekiq/f5931632-4fed-4d99-bed1-ab93b8dbeed9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Chris Altman

May 6, 2015, 7:26:42 PM
to sid...@googlegroups.com
That's a very good point!  That's probably exactly what it is.  Thanks for thinking of that!

Chris

Altonymous

Sep 24, 2015, 8:18:01 PM
to Sidekiq
I'm still encountering this issue. I believe I have ruled out retried jobs as the cause.

It might be that jobs are getting double enqueued. However, I'm not sure how to track that down. I am using Sidetiq to set up recurring jobs. The worker that does this work is a single process running a single worker, so I'm not sure how it could be double enqueueing work.

Queue Worker - 1 Machine, 1 Process, 1 Worker.
Parse Worker - 1 Machine, 2 Processes, 10 Workers.

Every 5 seconds the Queue Worker scans an S3 `upload` bucket for new files. When it finds them, it moves them from the `uploads` bucket to a `parsing` bucket. After it has moved a file, it enqueues work for the Parse Worker to import the file into the database. The Parse Worker is where the issue occurs. It seems that either more than one job is being enqueued or more than one worker is picking up the same job; I assume it's a double enqueue. I wrote a fair amount of code in the Queue Worker to try to avoid two jobs getting pushed onto the Parse Worker queue: I'm using Redis to check whether the file has already been seen, and skipping it if it has. Yet somehow more than one import of a single file is still being attempted.
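If the "already seen" check and the write that marks a file as seen are two separate Redis calls, two overlapping scans can both pass the check before either one writes. The thread does not show how the check is implemented, so this is only one possible gap. A minimal sketch of collapsing the check and the mark into a single atomic step, assuming the redis-rb client, where `set(key, value, nx: true, ex: ttl)` returns true only if the key did not exist yet. `claim_file`, `enqueue_new_files`, the `seen:` key prefix, and the TTL are hypothetical names and values; `InMemoryStore` stands in for a Redis connection so the sketch runs without a server, and it ignores the expiry.

```ruby
# Stand-in for a Redis client so the sketch runs standalone.
# It mimics only the true/false result of `set` with nx: true.
class InMemoryStore
  def initialize
    @data = {}
    @lock = Mutex.new
  end

  def set(key, value, nx: false, ex: nil)
    @lock.synchronize do
      return false if nx && @data.key?(key)
      @data[key] = value
      true
    end
  end
end

# Atomically mark a file as seen. Returns true for the one caller
# that wins the claim, false for everyone else.
def claim_file(store, s3_key, ttl: 86_400)
  store.set("seen:#{s3_key}", Time.now.to_i, nx: true, ex: ttl)
end

# Keep only the keys this scan managed to claim; these are the
# ones that would be enqueued for the Parse Worker.
def enqueue_new_files(store, keys)
  keys.select { |key| claim_file(store, key) }
end
```

With a real connection, `store` would be something like `Redis.new`. Overlapping scans can then all call `enqueue_new_files`, and each file is handed to at most one of them while its `seen:` key exists.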

Mike Perham

Sep 24, 2015, 11:52:00 PM
to sid...@googlegroups.com
Chris, preventing duplicate processing can be very difficult / racy.  I know that Sidekiq Enterprise's cron impl goes to great lengths to prevent duplicate cron job execution but I don't know what measures Sidetiq takes.  It's possible it's creating duplicate jobs but I'd hope it's mature enough by now to have any race conditions worked out of it.

In other words, ¯\_(ツ)_/¯

Mike

--
Mike Perham – CEO, Contributed Systems
Smart, effective open source infrastructure for your apps.