Sharing a queue among many servers

Marc Esher

Aug 16, 2010, 9:41:34 PM
to cfc...@googlegroups.com
Hi all,
Since it's been so long since cfcdev has had a message, I know you'll
all put down the beers, put the spouses and kids to bed, and jump at
the chance to pitch in here.

I'm whipping up a prototype app -- not for work, not for homework --
and I'm running a thought experiment: "how would I run this in a
multi-server environment?" I'm not looking for code samples or
anything really specific, just options.

Imagine you have a *lot* of background work to do. You have an unknown
-- and potentially elastic -- set of servers with which to do the
work. It's probably easiest to think of it as if you were running this
thing on Amazon EC2 or another service. You have a "queue" of work to
be done. You have worker servers to do the work.

What are your options for scheduling that work with ColdFusion? In an
ideal world, there'd be a shared queue, and workers would "take()" off
of that queue, with zero contention... i.e. as soon as one server
takes off the queue, that task is immediately unavailable to another
server that also attempted to take it.

I'd normally use Java for this, but I want to use CF to see how badass
it can get in a situation such as this one. And this means no Java
Timers, TimerTasks, or my dear friends the ExecutorService and its
wonderful relations... straight CF. I'm not opposed to event gateways,
though I'd really like to stay away from a JMS server if I can help
it. Importantly, it needs to be fairly easy to debug, which is
always a problem in cases such as these.

Thoughts?

Oh, I know: this is like when the boss comes in and says "we need it
fast, we need it simple, and we need it now". I have a good idea of
how I'd do this with Java, but I'm mostly interested in what CF could
provide.

Thanks!

Marc

Oscar Arevalo

Aug 16, 2010, 9:49:20 PM
to cfc...@googlegroups.com
Hi,
I think Amazon already has a queueing service of some sort: SQS, I believe. If I recall correctly, it is intended for exactly that purpose. Or are you already aware of it and just want to do the equivalent in CFML?



--
You received this message because you are subscribed to the Google Groups "CFCDev" group.
To post to this group, send email to cfc...@googlegroups.com.
To unsubscribe from this group, send email to cfcdev+un...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/cfcdev?hl=en.




--
Oscar Arevalo
http://www.coldbricks.com - Content Management System
http://www.oscararevalo.com


Cody Caughlan

Aug 16, 2010, 9:49:52 PM
to cfc...@googlegroups.com
Not sure if you're looking for something that is implemented in CF/Java, but...

I am a huge fan of backgrounding as much as possible and my first tool
of choice is Beanstalkd.

http://kr.github.com/beanstalkd/

It has a very simple text-based protocol (think memcached). It allows
you to set deadlines on jobs and to bury jobs for future review. It
supports job prioritization and delayed starts.
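
For readers who have not seen the protocol, here is a rough sketch of how the text framing looks on the wire. It is written in Python rather than CFML only so it can be run on its own. The helper names (`put_command`, `bury_command`, `parse_reserved`) are made up for illustration, and the default priority and time-to-run values are assumptions, not recommendations. The time-to-run field is the per-job deadline mentioned above.

```python
# Illustrative framing helpers for the beanstalkd text protocol.
# Helper names and default values are assumptions for this sketch.

def put_command(body: bytes, priority: int = 1024, delay: int = 0, ttr: int = 60) -> bytes:
    # Wire format: put <pri> <delay> <ttr> <bytes> CRLF <data> CRLF
    # delay = seconds before the job becomes ready; ttr = seconds a worker
    # may hold the job before the server releases it again.
    header = f"put {priority} {delay} {ttr} {len(body)}\r\n".encode("ascii")
    return header + body + b"\r\n"


def bury_command(job_id: int, priority: int = 1024) -> bytes:
    # Wire format: bury <id> <pri> CRLF (parks the job for later review)
    return f"bury {job_id} {priority}\r\n".encode("ascii")


def parse_reserved(reply: bytes):
    # A successful reserve reply looks like: RESERVED <id> <bytes> CRLF <data> CRLF
    header, _, rest = reply.partition(b"\r\n")
    word, job_id, size = header.split()
    if word != b"RESERVED":
        raise ValueError("not a RESERVED reply")
    return int(job_id), rest[:int(size)]
```

A real client would send these bytes over a TCP socket to the beanstalkd server and read the replies; that part is left out here.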

It is also very very fast.

There is a Java client - probably not too hard to wrap it in a CFC.

I love me some Beanstalkd.

/Cody

Barney Boisvert

Aug 16, 2010, 10:06:36 PM
to cfc...@googlegroups.com
In the absence of both JMS and SQS (if you're really on EC2), I'd use a
DB as a queue. You need some sort of centralized repository, and DBs
are good at that.

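component Queue {

    // query() is shorthand for "run this SQL and return the result";
    // use bound parameters instead of string interpolation in real code

    function put(message) {
        query("insert into queue (timestamp, status, message) values
            (now(), 'new', '#serializeJson(message)#')");
    }

    // true only when there is unclaimed work
    function has() {
        return query("select id from queue where status = 'new' limit 1").recordCount;
    }

    function take(workerId) {
        var sql = "select id, message from queue where status =
            'in-progress-#workerId#' limit 1";
        var q = query(sql);
        if (q.recordCount == 0) {
            // claim one new row in a single UPDATE (MySQL syntax) so two
            // workers can't grab the same message, then look again
            query("update queue set status = 'in-progress-#workerId#' where
                status = 'new' limit 1");
            q = query(sql);
            if (q.recordCount == 0) {
                return {}; // nothing left to take
            }
        }
        return {
            id: q.id,
            message: deserializeJson(q.message)
        };
    }

    function complete(messageId) {
        query("update queue set status = 'complete' where id = #messageId#");
    }

}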

No, it's not amazingly feature rich, snazzy, cutting edge, etc. But
if you need CF and you need it quick'n'dirty it'll get the job done.
And no, I doubt that'll compile. But you get the drift. Just drop
that CFC on every server that needs it and let 'em at it.
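
To make the "claim in a single UPDATE" idea concrete, here is a minimal runnable sketch of the same queue. It uses Python with an in-memory SQLite database purely so it can be run standalone; the class name `DbQueue`, the table layout, and the use of SQLite are assumptions for illustration, not a port of the CFC above. SQLite's default build has no `UPDATE ... LIMIT`, so the sketch picks the row with a subquery instead.

```python
import json
import sqlite3


class DbQueue:
    """Hypothetical DB-backed queue; one UPDATE statement claims one row."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS queue ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "status TEXT NOT NULL, message TEXT NOT NULL)"
        )
        conn.commit()

    def put(self, message):
        # Bound parameters avoid quoting problems in the JSON payload.
        self.conn.execute(
            "INSERT INTO queue (status, message) VALUES ('new', ?)",
            (json.dumps(message),),
        )
        self.conn.commit()

    def take(self, worker_id):
        claimed = f"in-progress-{worker_id}"
        # Claim the oldest 'new' row; the database applies this as one statement.
        self.conn.execute(
            "UPDATE queue SET status = ? WHERE id = "
            "(SELECT id FROM queue WHERE status = 'new' ORDER BY id LIMIT 1)",
            (claimed,),
        )
        self.conn.commit()
        row = self.conn.execute(
            "SELECT id, message FROM queue WHERE status = ? ORDER BY id LIMIT 1",
            (claimed,),
        ).fetchone()
        if row is None:
            return None  # nothing to take
        return {"id": row[0], "message": json.loads(row[1])}

    def complete(self, message_id):
        self.conn.execute(
            "UPDATE queue SET status = 'complete' WHERE id = ?", (message_id,)
        )
        self.conn.commit()
```

With a shared database server in place of SQLite, each worker would open its own connection and call `take` with its own id.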

As for scheduling the 'take's, you're kind of stuck without some kind
of scheduling framework, and you ruled all those out. ;) In the
absence of one of those you could use a method like this:

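function workerThread() {
    // run the polling loop on its own thread so the caller returns right away
    thread name="queueWorker" action="run" {
        workerThreadInternal();
    }
}
function workerThreadInternal() {
    var id = createUUID(); // identifies this worker's claimed rows
    while (true) {
        if (application.queue.has()) {
            var msg = application.queue.take(id);
            // another worker may have claimed the last message first
            if (!structIsEmpty(msg)) {
                application.worker.doIt(msg.message);
                application.queue.complete(msg.id);
            }
            sleep(100);    // brief pause between jobs
        } else {
            sleep(10000);  // queue is empty; poll again in 10 seconds
        }
    }
}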

Then invoke workerThread in OnApplicationStart to kick 'er off.
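
The polling loop described here can be sketched in a runnable form as follows. This is Python rather than CFML so it can be tried standalone, and every name in it (`run_worker`, the `take`/`handle`/`complete` callables, the `stop` event, the pause lengths) is an assumption for illustration. Unlike the loop above, it adds a stop signal so the thread can be shut down cleanly.

```python
import threading
import uuid


def run_worker(take, handle, complete, stop, busy_pause=0.1, idle_pause=10.0):
    """Poll for work until `stop` is set.

    take(worker_id) returns {"id": ..., "message": ...} or None when idle;
    handle(message) does the work; complete(id) marks the job done.
    """
    worker_id = str(uuid.uuid4())  # unique id for this worker's claims
    while not stop.is_set():
        msg = take(worker_id)
        if msg is None:
            # Nothing to do: back off for longer, but wake early if stopped.
            stop.wait(idle_pause)
            continue
        handle(msg["message"])
        complete(msg["id"])
        stop.wait(busy_pause)  # short pause between jobs


def start_worker(take, handle, complete):
    # Start the loop on a background thread; returns the thread and its stop event.
    stop = threading.Event()
    thread = threading.Thread(
        target=run_worker, args=(take, handle, complete, stop), daemon=True
    )
    thread.start()
    return thread, stop
```

Calling `stop.set()` ends the loop after the current job, which also makes the worker easier to test than a bare `while (true)`.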

This has a pile of downsides, which is why stuff like JMS and SQS
exist. If you have alternatives, I wouldn't vote for something like
this, though I've certainly used this sort of thing more than once in
real apps.

cheers,
barneyb

--
Barney Boisvert
bboi...@gmail.com
http://www.barneyb.com/

Judah McAuley

Aug 16, 2010, 10:56:32 PM
to cfc...@googlegroups.com
Echoing the others so far, I'd use JMS.

But if you want a CF-only solution, I'd probably use a pub-sub model.
Have each worker come up and register itself with the scheduling
server saying "I'm ready to work and here is my contact url". The
scheduler server keeps a local store of the known workers and the urls
to send work notifications to.

When a job becomes available, queue it up locally and then broadcast
the availability message to all the worker processes in parallel
(cfthread to spawn cfhttp connections) with the id of the job and the
URL to call to take it. When a worker gets the message, it checks
whether it is currently busy. If it is, it ignores the message. If
it is not busy, it tries to contact the scheduler with the job token
and "take" the job.

The scheduler locks the job table, assigns the job to one of the incoming
requests, then removes it from the pool of available jobs. You would
probably want a recheck interval to make sure that the job got
picked up and, if it didn't, broadcast it again. You'd also want the
worker services to notify the scheduler when the job is complete, with
success/failure, etc.
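
The scheduler side of this can be sketched as below: a lock guards the pool of available jobs, so when several workers answer the same broadcast only the first claim succeeds. This is an in-memory Python illustration, not CFML; the `Scheduler` class, its method names, and the worker URLs are all hypothetical, and the HTTP broadcast itself is left out.

```python
import threading


class Scheduler:
    """Hypothetical in-memory scheduler: first claim on a job wins."""

    def __init__(self):
        self._lock = threading.Lock()
        self._available = {}  # job_id -> payload, not yet taken
        self._assigned = {}   # job_id -> URL of the worker that took it
        self._workers = set()

    def register(self, worker_url):
        # A worker announces "I'm ready to work and here is my contact URL".
        with self._lock:
            self._workers.add(worker_url)

    def announce(self, job_id, payload):
        # Queue the job locally and return the URLs that should be notified.
        with self._lock:
            self._available[job_id] = payload
            return sorted(self._workers)

    def claim(self, job_id, worker_url):
        # Assign the job and remove it from the pool under one lock,
        # so two workers can never both receive it.
        with self._lock:
            if job_id not in self._available:
                return None  # someone else already took it
            self._assigned[job_id] = worker_url
            return self._available.pop(job_id)

    def unclaimed(self):
        # Jobs nobody picked up; candidates for a re-broadcast.
        with self._lock:
            return list(self._available)
```

A periodic check of `unclaimed()` would cover the recheck interval mentioned above.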

One other thought I had was to use Railo's Server scope as a shared
queue between scheduler and the workers instead of a database. The
upside would be that it would avoid the db locking but on the downside
it wouldn't necessarily have the state persistence in case of server
errors, etc. I'd probably favor the db over the server scope for
reliability but you might be able to find a decent mix of the two
where you persist every change to the server scope queue into the
database, etc.

Judah

Ronan Lucio

Aug 17, 2010, 8:19:40 AM
to cfc...@googlegroups.com
Hi Marc,

If you want a really stable, highly available, and elastic queue system, I think Amazon SQS is the answer, regardless of the application's language, hosting, or technology.
Cloud computing is headed that way.

Look at this example:
http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1602&categoryID=55

Of course all the other suggestions in the thread are valid ones, but thinking specifically about your question:

> Oh, I know: this is like when the boss comes in and says "we need it
> fast, we need it simple, and we need it now"

Unless you already have that work done, SQS seems to be the fastest, most stable, and most highly available way.

Ronan

Marc Esher

Aug 19, 2010, 7:38:38 PM
to cfc...@googlegroups.com
Great. Thanks for all the feedback, folks.

Marc
