Ruby Multi-threaded assistance request...


Mel Riffe

Nov 17, 2015, 5:21:25 PM
to refac...@googlegroups.com, rub...@googlegroups.com
Hey Folks!

I have a situation that I think needs some multi-threaded-type programming, which is outside my comfort zone.

There's a long-running process my client is trying to shorten by starting multiple SSH terminal sessions and kicking off concurrent runs. During this long-running process there comes a point where things get numbered, more or less, sequentially. This number needs to be unique within the entire database. But I can't use an auto-incrementing database field because the value, in part, is derived from a specified date. But I digress.

I'm using a database table to contain the 'next' value, incrementing it once it's been used.

Using ActiveRecord's lock didn't prevent duplicates.

Using Thread.exclusive around the 'next' operation didn't prevent duplicates.

Using Thread.exclusive one level higher prevents duplicates within a set but still allows duplicates across sets.

I'm truly out of ideas. Right now all suggestions are welcomed and greatly appreciated.

--Mel

Charles Feduke

Nov 17, 2015, 5:42:37 PM
to refac...@googlegroups.com, rub...@googlegroups.com
If you are using PostgreSQL you can use advisory locks (see http://www.postgresql.org/docs/current/static/explicit-locking.html section 13.3.5).

However, you should also be able to incorporate the date into a sequence through a function and a table trigger, again if using PostgreSQL.

Other RDBMSes should have something similar to PGSQL's advisory locks (I know both MS SQL Server and Oracle do).

Is it possible to reorder the problem so the number is generated first from a serial access number generator, and then utilize it later? That is, if a process with a reserved number fails, is it okay to potentially have holes in the sequence? (This can create edge cases where jobs that kick off around midnight may complete the following day but have the previous day's date in their identifier.)


Jamie Orchard-Hays

Nov 17, 2015, 7:42:41 PM
to refac...@googlegroups.com, rub...@googlegroups.com
Seems like the FB should be able to give you an incremented number quite easily. It won't matter which thread or process accesses when, since the FB is coordinating it.

What FB are you using?

Another strategy is to use UUIDs rather than Ints. They'll be unique no matter where you generate them. 
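For reference, a minimal sketch of the UUID route using Ruby's standard library. It only shows that no shared counter is needed; whether a UUID is acceptable as the identifier depends on the requirements.

```ruby
require "securerandom"

# Each call returns a random (version 4) UUID string; no shared counter
# or database round trip is involved.
id = SecureRandom.uuid
puts id
```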

(sent from an iPhone)

On Nov 17, 2015, at 7:02 PM, Justin Etheredge <jus...@etheredge.us> wrote:

Using lock in ActiveRecord causes a 'select ... for update'. This should cause any operations that occur outside of the existing transaction to wait for that lock to be released. Is it possible you're not using lock within an explicit transaction?
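A minimal sketch of what "lock within an explicit transaction" might look like. The helper name, the model passed in as `counter_class` (something like a WidgetCounter), and the `quarter` and `next_value` columns are all hypothetical; this is not the code from the original poster's project.

```ruby
# Hypothetical helper: `counter_class` is assumed to be an ActiveRecord model
# with `quarter` and `next_value` columns (names invented for illustration).
def reserve_widget_number(counter_class, quarter)
  counter_class.transaction do
    # `lock` makes the SELECT a 'select ... for update', so another
    # transaction asking for the same row waits until this one commits.
    counter = counter_class.lock.find_by!(quarter: quarter)
    reserved = counter.next_value
    counter.update!(next_value: reserved + 1)
    reserved   # the transaction block returns this value
  end
end
```

The read and the increment happen inside the same transaction, so the row lock is held for the whole read-modify-write. Without the surrounding transaction, the lock would be released as soon as the SELECT finished.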

On Nov 17, 2015, at 6:32 PM, Jacques Fuentes <jpfue...@gmail.com> wrote:

I'm with Charles on this one. What you want to do is better served by using explicit locking or via triggers/functions within the right transaction isolation levels. Your RDBMS (excluding MySQL MyISAM) is built to solve these types of problems much more easily than Rails is.

Further, it sounds like these "multiple SSH runs" are `rake` commands that load the Rails environment? Are they creating multiple processes? If so, threads won't save you. In the case of a forked webserver (unicorn & some versions of passenger & puma), threads won't save you.
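To illustrate the process-versus-thread point: a Mutex or Thread.exclusive only guards threads inside one Ruby process, so separate processes need a lock they can all see. The sketch below uses File#flock as one such lock on a single host. The counter file is a hypothetical stand-in for the 'next value' table, and the example relies on fork, so it assumes a Unix-like system.

```ruby
require "tmpdir"

# Hypothetical counter file standing in for the 'next value' table.
COUNTER_PATH = File.join(Dir.mktmpdir, "widget_counter")
File.write(COUNTER_PATH, "0")

# Read, increment, and write back while holding an exclusive file lock.
def next_number(path)
  File.open(path, File::RDWR) do |f|
    f.flock(File::LOCK_EX)   # blocks until no other process holds the lock
    value = f.read.to_i + 1
    f.rewind
    f.truncate(0)
    f.write(value.to_s)
    f.flush
    value
  end                        # closing the file releases the lock
end

# Four separate processes, 25 reservations each.
pids = 4.times.map { fork { 25.times { next_number(COUNTER_PATH) } } }
pids.each { |pid| Process.wait(pid) }

final = File.read(COUNTER_PATH).to_i
puts final
```

With the database already in the picture, a row lock inside a transaction plays the same role as the file lock here and also works across hosts.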

Jamie Orchard-Hays

Nov 17, 2015, 10:35:22 PM
to refac...@googlegroups.com, rub...@googlegroups.com
"FB"???? I meant "D

(sent from an iPhone)

On Nov 17, 2015, at 8:58 PM, Bob Larrick <lar...@gmail.com> wrote:

+1 for using UUID if possible.


If there's a significant amount of logic around generating the number (e.g. if you have to incorporate the specified date, the angle of Saturn, eye of newt, etc.), it might be simpler to write a small Ruby process that runs outside the main app/rake task. This could accept the specified date and other relevant data as input and output a unique number.

Might be more comfortable to stay in Ruby rather than learn the idiosyncrasies of whatever DB you're in, and would provide a simple single point of synchronization.

Initializing with value from DB is an exercise left to the reader :)



-Bob
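A rough sketch of the "single point of synchronization" idea: one object, living in one process, owns the counters and hands out numbers under a Mutex. The class name, the per-quarter key, and the "2015Q4-00001" format are assumptions made for illustration. How other processes would reach it (a socket, DRb, and so on) and how it would be seeded from the database are left out.

```ruby
require "date"

# Hypothetical generator: one instance, in one process, owns all counters.
class WidgetNumberGenerator
  def initialize
    @mutex = Mutex.new
    @counters = Hash.new(0)   # quarter key => last sequence handed out
  end

  # Returns a string such as "2015Q4-00001" (the format is an assumption).
  def next_for(date)
    quarter = "#{date.year}Q#{(date.month - 1) / 3 + 1}"
    sequence = @mutex.synchronize { @counters[quarter] += 1 }
    format("%s-%05d", quarter, sequence)
  end
end

generator = WidgetNumberGenerator.new
date = Date.new(2015, 11, 17)

# Eight threads request numbers concurrently; none should repeat.
numbers = 8.times.map {
  Thread.new { 50.times.map { generator.next_for(date) } }
}.flat_map(&:value)

puts numbers.uniq.size == numbers.size
```

The Mutex is enough here only because every request goes through the same process; that is the whole point of routing generation through one place.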

Jamie Orchard-Hays

Nov 17, 2015, 10:36:05 PM
to refac...@googlegroups.com, rub...@googlegroups.com
"FB"? I meant "DB"!!

(sent from an iPhone)


Mel Riffe

Nov 18, 2015, 9:58:44 AM
to refac...@googlegroups.com, rub...@googlegroups.com
Thanks for the great responses!

First, I'm not using Postgres with this particular project. MySQL InnoDB. I'm not opposed to using triggers; it's just not at the top of my list.

Second, I can't use UUIDs. It's, unfortunately, a combination serial number and widget counter. Based on a calendar quarter, it is a sequential number assigned to each widget produced. This id is then used, downstream, for data analysis. There are at least 10 years of inertia behind this construct.

Third, I was wondering what Facebook (FB) had to do with anything. :-D

Fourth, it is multiple invocations of a rails runner script. However the process (and by this I mean the ruby code that gets executed) doesn't spawn other processes except for queueing jobs to DelayedJob.

Fifth, here is the actual implementation of the widget number generator: https://gist.github.com/melriffe/227fbddc82b66f6db04b I am not using it in an explicit transaction, though. And, by that, I'm assuming ActiveRecord::Base.transaction do ... end, yes?

Right now I'm contemplating moving the generation and assignment of this id outside the main script. I already have access to DelayedJob. I'm thinking I could create a single worker for a named queue, thereby forcing a single process through this generation/assignment step (or, at least, that's the theory).
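The single-worker theory can be sketched in-process as follows. This is an analogy only, not DelayedJob code: Ruby's Queue stands in for the named queue, one thread stands in for the lone worker, and the widget names are made up.

```ruby
# In-process analogy only: Queue stands in for the named DelayedJob queue,
# and the single thread stands in for its one worker.
requests = Queue.new
assigned = {}
next_value = 1

worker = Thread.new do
  # Only this thread touches next_value, so no lock is needed.
  while (widget = requests.pop) != :done
    assigned[widget] = next_value
    next_value += 1
  end
end

# Several producers enqueue widgets concurrently.
producers = 4.times.map do |p|
  Thread.new { 10.times { |i| requests << "widget-#{p}-#{i}" } }
end
producers.each(&:join)

requests << :done
worker.join

puts assigned.values.sort == (1..40).to_a
```

The guarantee holds only while exactly one worker drains the queue; starting a second worker on the same queue would bring the race back.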

Thoughts? Reactions?

Mel Riffe

Nov 20, 2015, 11:15:35 PM
to RefactorRVA, rub...@googlegroups.com
So far, Al's suggestion is the winner. But I want to thank everyone that offered suggestions.

--Mel