Job Executor - Job Starvation


Rob P

May 28, 2014, 3:18:23 AM
to camunda...@googlegroups.com
Hi Guys,

I've been playing with the Camunda engine under load, and it looks like there may be some job starvation when the system is busy. I drilled down into the SQL select for jobs and, for the MySQL version, noticed there is no ORDER BY clause. Hence the order in which jobs are processed may depend on the record order returned from the DB; conceptually this could even be LIFO, so a runnable job may be stuck for some time...

Thus I considered adding an ORDER BY clause based on due date. However, not all jobs have a due date.

Hence, should consideration be given to enforcing that all jobs have a due date? If a job is ready to run now, rather than creating it with a null due date, create it with a due date set to now(), i.e. the system time. If that were the case, jobs could be ordered by due date and thus processed FIFO.

Whilst I was thinking this through, another idea I had was: what if a process were given a priority? If a priority column were added to the job table, the job selection SQL could become something like:

SELECT *
FROM ACT_RU_JOB
WHERE DUEDATE_ <= now()                                 -- job is ready to run
  AND (LOCK_OWNER_ IS NULL OR LOCK_EXP_TIME_ <= now())  -- and not locked
ORDER BY PRIORITY_, DUEDATE_                            -- PRIORITY_ being the proposed new column

There's another potential advantage to enforcing that the due date is not null. If there is always a value, it becomes possible to measure latency, i.e. the difference between when a job became available for processing and when it was actually run. This would be a useful metric for monitoring and performance tuning.
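
For illustration, if the engine also recorded when each job actually started executing, the metric could be computed with something like this (the log table and STARTED_TIME_ column are purely hypothetical, MySQL syntax assumed):

SELECT AVG(TIMESTAMPDIFF(SECOND, DUEDATE_, STARTED_TIME_)) AS avg_latency_sec
FROM MY_JOB_LOG                 -- hypothetical job execution log
WHERE STARTED_TIME_ IS NOT NULL;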

Now I recognise that use of a priority construct can itself lead to starvation; however, this is best governed by usage principles, i.e. use it sparingly and intelligently. In addition, if both priority and due date were enforced, there's nothing stopping the development of an alternate dispatch strategy where jobs are ordered by, say, PRIORITY_ * (now() - DUEDATE_) for example...
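
To sketch what such an aging strategy might look like in MySQL terms (PRIORITY_ being the proposed new column, with higher values assumed to mean higher priority):

SELECT *
FROM ACT_RU_JOB
WHERE DUEDATE_ <= now()
-- the longer a job waits, the higher it is weighted
ORDER BY PRIORITY_ * TIMESTAMPDIFF(SECOND, DUEDATE_, now()) DESC;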

Thoughts?

regards

Rob

Martin Schimak

May 28, 2014, 4:32:20 AM
to camunda...@googlegroups.com
Hi Rob,

+1 for the analysis… I reported my observation of the behavior you describe to some camunda people at the community day in Prague, and would be interested in any new thoughts about it, too.

Many greetings,
Martin.

Daniel Meyer

May 28, 2014, 4:36:35 AM
to camunda...@googlegroups.com
Hi Guys,

very interesting feedback, thank you for that. Thorben worked on this on a branch a while back. I would be interested to get your thoughts on that change:

https://github.com/camunda/camunda-bpm-platform/compare/CAM-1214

Cheers,
Daniel

Daniel Meyer

May 28, 2014, 4:41:04 AM
to camunda...@googlegroups.com
The basic idea was to make JOB.DUEDATE_ NOT NULL and to add a PRIORITY_ column:

https://github.com/camunda/camunda-bpm-platform/compare/CAM-1214#diff-20

Cheers,
Daniel

thorben....@student.hpi.uni-potsdam.de

May 28, 2014, 5:31:57 AM
to camunda...@googlegroups.com
Hi everyone,

see also the JIRA issue that describes our reasoning at that time in detail and is pretty similar to Rob's ideas: https://app.camunda.com/jira/browse/CAM-1214

I think one of the most uncertain points for me at that time was the possible impact on performance.

Cheers,
Thorben

Rob P

May 28, 2014, 6:05:47 AM
to camunda...@googlegroups.com
Hi Thorben,

Great work and nice analysis! Glad to see some alignment here. A pluggable job selection strategy may be useful, such that an admin could configure the best job acquisition strategy for their use case. In addition, it's reasonable to expect that highly loaded systems would be deployed in a cluster, with multiple nodes selecting jobs. I could envisage a setup with one set of nodes using a selection strategy based purely on priority, and another set using a more egalitarian strategy, e.g. one incorporating the aging approach. In other words, I ensure at least fair throughput, yet have some nodes that can spike for high-priority jobs - the best of both worlds!

Another consideration is that the overhead of a more sophisticated selection query may be absorbed by the multiple nodes of a loaded cluster.

So +1 for realising the pattern in an upcoming release.

regards

Rob


thorben....@student.hpi.uni-potsdam.de

May 28, 2014, 7:04:21 PM
to camunda...@googlegroups.com
Hi Rob,

the cluster setup you describe is a good idea. As long as prioritization is not mandatory, adding this feature cannot make job acquisition worse in general. However, prioritization does not come for free, as it involves database schema changes etc., so we should be certain it is a useful addition. I think close contact with people interested in and willing to use such a feature could help us get to know the use cases, the workloads we are targeting, and of course gather performance feedback.

But this is only my personal opinion, as I am currently not involved in camunda development. Daniel or Robert Gimbel can speak better to what the plan for this feature is.

Best regards,
Thorben

Rob P

May 28, 2014, 9:45:02 PM
to camunda...@googlegroups.com
Hi Thorben,

Spot on! Hence my desire for a pluggable execution strategy. There will be cases where throughput is the priority at the expense of liveness, and cases where liveness is the priority over throughput. In my clustered environment, I may be able to have both via different job selection strategies on each node.

So I guess the minimalist changes are (a rough schema sketch follows below):
Enforce DUEDATE_ to be NOT NULL
Add a PRIORITY_ column to the job table, and probably a priority attribute in the process definition constructs
Add an index to the job table on (PRIORITY_, DUEDATE_)
Provide a pluggable job acquisition strategy, with a default implementation based on the current unordered selection
Support development of alternate job selection strategies
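
In MySQL terms, the first three changes might look roughly like this (exact names and types are placeholders, not a definitive migration):

ALTER TABLE ACT_RU_JOB MODIFY DUEDATE_ datetime NOT NULL;
ALTER TABLE ACT_RU_JOB ADD PRIORITY_ integer NOT NULL DEFAULT 0;
CREATE INDEX ACT_IDX_JOB_PRIORITY ON ACT_RU_JOB (PRIORITY_, DUEDATE_);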

One other very radical thought I had was to consider splitting the job into two parts: a persistent part and a volatile part. The volatile part would hold the lock owner, lock expiry, due date and priority; the persistent part would hold the primary and foreign keys and the status fields. Where I was going with this was: what if the persistent part were stored in a disk-based table, while the volatile part lived in an in-memory table? The assumption is that on startup, the volatile, in-memory table is derived from the persistent components. If the DB were to crash and the volatile parts were lost, at least the jobs themselves are not lost; they could be rerun, much as when an exception occurs. Job selection would then occur over the in-memory table...
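
Sketching this in MySQL terms (the table and column names are invented for illustration, with the MEMORY engine standing in for the in-memory table):

-- persistent part: keys and status, on disk
CREATE TABLE ACT_RU_JOB_CORE (
  ID_ varchar(64) PRIMARY KEY,
  EXECUTION_ID_ varchar(64),
  HANDLER_TYPE_ varchar(255),
  RETRIES_ integer
) ENGINE=InnoDB;

-- volatile part: rebuilt from the persistent part on startup, lost on a crash
CREATE TABLE ACT_RU_JOB_STATE (
  JOB_ID_ varchar(64) PRIMARY KEY,
  LOCK_OWNER_ varchar(255),
  LOCK_EXP_TIME_ datetime,
  DUEDATE_ datetime NOT NULL,
  PRIORITY_ integer NOT NULL
) ENGINE=MEMORY;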

It's a radical thought and slightly more complex. There is also an assumption that the in-memory performance boost outweighs the overhead of splitting the entity across two constructs...

regards

Rob


Rob P

Jul 1, 2014, 8:24:50 AM
to camunda...@googlegroups.com
Hi Thorben, Daniel et al,

I found this performance analysis and its ideas interesting, particularly for job executor performance (link below)... If only I had more time to rigorously investigate alternatives...



regards

Rob

Rob P

Jan 11, 2015, 11:37:55 PM
to camunda...@googlegroups.com


Hi All,
I've still been pondering JobExecutor scaling and throughput. As an alternative to job prioritisation, what if each process definition had a 'class of service' and, rather than one job executor per node, each node had a job executor pool with a job executor dedicated to each class? Note: Tomcat-based architecture here...

For example, let's assume I have class 1 (low priority), class 2 (normal priority) and class 3 (high priority) processes. I then configure a job executor pool per node with three job executors running, one dedicated to each class of job. The acquire-jobs select statement becomes 'select ready jobs where class = [x]', where x is a parameter associated with the job executor, thus implementing affinity between a job executor and its corresponding class (see the sketch below).
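
In rough MySQL terms, each executor's acquisition query might look like this (CLASS_ is the hypothetical new column; the lock predicate is simplified):

SELECT *
FROM ACT_RU_JOB
WHERE (DUEDATE_ IS NULL OR DUEDATE_ <= now())
  AND (LOCK_OWNER_ IS NULL OR LOCK_EXP_TIME_ <= now())
  AND CLASS_ = ?    -- the class of service this executor is dedicated to
LIMIT 3;            -- acquire a small batch at a time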

Hence in a shared clustered engine environment, this effectively divides the job table into logical partitions based on class of service, and could thus reduce optimistic locking contention. In addition, there should be a much lower chance of job starvation, as there is less chance of a high-priority job being blocked by a normal-priority job.

Note, my preference would be to have class of service at task granularity rather than process granularity; however, I didn't want to get into the complexity of exclusive versus non-exclusive jobs just yet.

Any thoughts/feedback on this as a scaling and throughput strategy?

regards

Rob
 

Bernd Rücker (camunda)

Jan 12, 2015, 10:14:00 AM
to camunda...@googlegroups.com

Hi Rob.

Sounds completely reasonable to me. It should even be doable out-of-the-box with multiple engines, assigning deployments to a specific engine. We once even introduced dedicated cluster nodes executing only the jobs of a high-priority process definition… I don't remember the exact figures, but the overall idea worked (at least I haven't heard anything that it didn't ;-)).

Cheers

Bernd


Rob P

Jan 12, 2015, 6:12:16 PM
to camunda...@googlegroups.com
Hi Bernd,

I agree that this could be realised out of the box via partitioning across multiple engines. The downside of this approach is that process definitions may be dispersed across multiple engines, and thus the Cockpit and Tasklist UI experience may be less than optimal.

Hence what is on my wishlist is to achieve a similar outcome, but with one logical engine (cluster). If there were an extra column on the job table storing the job classification, and I could configure the job executor's acquisition selector to have affinity with a class of job, then I could configure job executor nodes with affinity to a job classification.

Perhaps this could be the next incremental change to the job executor: add a column, possibly called priority. Give each process a default priority of, say, 100, but allow setting the priority as an extension in the BPMN model. Add two parameters to the default job executor such that job selection is based on x <= priority < y. By default, x = 0 and y = MAX_INT, so all job classifications are in scope. In a cluster, however, I could configure these ranges such that nodes have affinity with a class of process (sketched below).
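
A rough sketch of that range-based acquisition (x and y being the two configured executor parameters; PRIORITY_ is still the hypothetical column):

SELECT *
FROM ACT_RU_JOB
WHERE (DUEDATE_ IS NULL OR DUEDATE_ <= now())
  AND PRIORITY_ >= ?   -- lower bound x, default 0
  AND PRIORITY_ <  ?   -- upper bound y, default MAX_INT
LIMIT 3;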

In addition, given the pluggable nature of the job executor, other implementations could use a SELECT ... ORDER BY priority, or even a SELECT ... ORDER BY priority * (now() - duedate), etc.

regards

Rob 

Bernd Rücker (camunda)

Jan 13, 2015, 9:11:35 AM
to camunda...@googlegroups.com

Hi Rob.

Actually I think you could also realize it by using different engines pointing to the SAME database (as the job executor is “deployment-aware” and only takes the right jobs). The challenge is to point the clients to the right engine. But you would have one single Cockpit in that scenario.

But I agree that the job priority would be a valuable addition. We have actually discussed that internally a couple of times already too, but we haven't yet added it to a concrete point in the roadmap. That's more a topic for the core team or Robert to comment on…

Rob P

Jan 13, 2015, 4:51:32 PM
to camunda...@googlegroups.com
Hi Bernd,

You are right - a heterogeneous cluster could effectively partition based on deployments.

Perhaps then, the best architectural approach at the moment would be to partition the engine cluster into two tiers:
Tier 1, which I shall call the client tier, is a homogeneous cluster; all client interactions occur through it. Hence any client-tier engine node can service any client, and thus even Tasklist works fine. The job executors on the client-tier engine nodes are turned off.

Tier 2, which I shall call the job executor tier, is a heterogeneous cluster whose sole purpose is to run the job executors. No clients should interact with these nodes. Your deployment process effectively determines the partitioning in this tier, so classes of process deployments can be independently resourced.

I guess the consequence of this architecture is that partitioning depends on your deployment processes rather than on configuration, and there may still be contention for job acquisition on the job table.

If this architecture is feasible for scaling out, then a remaining challenge is how to scale out the UI in a shared engine context. In other words, in Tasklist and Cockpit, not all users should have visibility of all processes. Hence in terms of priorities, perhaps enhancing the authorisation infrastructure should be the priority, in support of the scaling approach described above.

regards

Rob

Robert Gimbel

Jun 23, 2015, 8:18:43 AM
to camunda...@googlegroups.com
For anyone who is motivated to contribute to this topic, please follow this link to our survey: https://www.surveymonkey.com/r/PXCZVXR

Thanks a lot!