Long running service tasks are started multiple times?


tosc...@googlemail.com

Jun 18, 2014, 9:48:22 AM6/18/14
to camunda-...@googlegroups.com
Hi,

in the following scenario the job engine starts a service task twice. Why?

Scenario: A simple process model contains a single service task which is marked with "async continuation" (see http://camunda.org/share/#/process/5d11f950-186d-4040-b4d9-bc238a9e5ebb). The service task takes more than five minutes to execute, e.g. "Thread.sleep(6*60*1000);".

Actual behaviour on 7.1.0-Final:
The service task starts and runs correctly. After exactly five minutes the job engine starts the service task a second time (why?), which then results in an OptimisticLockingException as soon as the first execution of the task finishes.

Is this the expected behaviour? Why is the service task started a second time? Which timeout of five minutes is responsible for this behaviour?

The correct solution for such long running tasks would probably be to model a "Send Task" which kicks off an asynchronous execution, e.g. in a thread, and to correlate the result back to a subsequent "Receive Task" after completion.

However, the behaviour I described above seems kind of strange.

Regards
Tobias

nico s

Jun 18, 2014, 10:32:16 AM6/18/14
to camunda-...@googlegroups.com, tosc...@googlemail.com
I have the same problem. A workaround is to unmark "async continuation", but then I cannot see the process state in the Cockpit console.

Is there a solution?

Thanks 

webcyberrob

Jun 18, 2014, 10:53:27 AM6/18/14
to camunda-...@googlegroups.com, tosc...@googlemail.com
Hi guys,

This is the behaviour I would expect. The reason lies in how the job executor works:

The job executor will 'see' the async continuation as a job in the job queue. Hence the job executor will lock the job for execution with a lease which lasts 5 mins by default. The job will then be allocated to a thread within the job executor thread pool.

The purpose of the lock is to support multiple concurrent job executors. Each job executor uses its own identifier to lock jobs, which prevents the executors from tripping over each other. Under normal conditions a job is expected to complete before its lease expires; when it completes, it is removed from the job table. If the job takes longer than the lease time, another job executor will deem the lease expired, and thus invalid, and attempt to execute the job again. This behaviour works well when a job executor crashes: the job will eventually get picked up in a future acquisition cycle.
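The lease mechanism can be sketched in plain Java. This is a toy model of the idea, not Camunda internals; the class and field names are made up for illustration:

```java
import java.time.Duration;
import java.time.Instant;

// Toy model of a job-table row with a lock owner and a lock expiry,
// mimicking how a job executor leases jobs (names are illustrative).
class Job {
    String lockOwner;          // identifier of the executor holding the lease
    Instant lockExpiration;    // when the lease becomes invalid

    // An executor may acquire the job if it is unlocked or the lease expired.
    synchronized boolean tryLock(String executorId, Duration lease, Instant now) {
        if (lockOwner == null || now.isAfter(lockExpiration)) {
            lockOwner = executorId;
            lockExpiration = now.plus(lease);
            return true;
        }
        return false;
    }
}

public class LeaseDemo {
    public static void main(String[] args) {
        Job job = new Job();
        Duration lease = Duration.ofMinutes(5);   // default lock time
        Instant t0 = Instant.now();

        // Executor A acquires the job.
        System.out.println(job.tryLock("executor-A", lease, t0));                 // true
        // Executor B cannot steal it while the lease is still valid.
        System.out.println(job.tryLock("executor-B", lease, t0.plusSeconds(60))); // false
        // After the five-minute lease has expired, B acquires it too --
        // exactly the duplicate execution observed in this thread.
        System.out.println(job.tryLock("executor-B", lease, t0.plusSeconds(360))); // true
    }
}
```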

So the behaviour you are seeing is this: the first thread is tied up in your job, and when the lease expires after five minutes, a second thread picks up the job. The first thread eventually finishes, followed by the second, which leads to the optimistic locking exception.

Your options are:
1. Break your long running task into a task that initiates processing outside the engine, followed by a message receive task to accept a callback when the long running work has completed.
2. Change the job executor configuration to increase the lease expiry time. On Tomcat, I believe you can do this via a JMX management bean...

Option 1 would be my preferred pattern...
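The control flow of option 1 can be sketched in plain Java. This is a simulation of the pattern under assumptions, not the Camunda API: in a real process, `correlateCallback` would be replaced by a message correlation that completes the receive task, and all names here are illustrative:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CallbackPatternDemo {
    // Stand-in for the engine call that would correlate a message to the
    // subsequent receive task (illustrative, not a real Camunda method).
    static void correlateCallback(String processInstanceId, String result) {
        System.out.println("callback for " + processInstanceId + ": " + result);
    }

    // The "send task" only *starts* the long-running work and returns at once,
    // so the engine transaction (and the job lease) stays well under 5 minutes.
    static CompletableFuture<String> startLongRunningWork(String processInstanceId,
                                                          ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> {
            // ... the actual long-running work runs here, outside the engine ...
            return "done";
        }, pool).whenComplete((result, error) -> {
            if (error == null) {
                correlateCallback(processInstanceId, result);
            }
        });
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        String result = startLongRunningWork("instance-42", pool).join();
        System.out.println("result: " + result);
        pool.shutdown();
    }
}
```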

regards

Rob

nico s

Jun 19, 2014, 3:22:07 AM6/19/14
to camunda-...@googlegroups.com, tosc...@googlemail.com
Hi, 

I solved it by increasing lockTimeInMillis from 300000 to 900000, i.e. from 5 minutes to 15 minutes. This property can be set in bpm-platform.xml in ${catalina_home}/conf if you use a shared engine. Here is the snippet:

<job-executor>
  <job-acquisition name="default">
    <properties>
      <property name="lockTimeInMillis">900000</property>
      <property name="waitTimeInMillis">5000</property>
    </properties>
  </job-acquisition>
</job-executor>

I hope this can help you.

Regards.
Nico-

Daniel Meyer

Jun 19, 2014, 3:23:30 AM6/19/14
to camunda-...@googlegroups.com, tosc...@googlemail.com
Yes, Rob is right!

Some additional context: in an ACID environment you do not want long running transactions, since they lead to locks being held in the database for a long time and thus increase the probability of deadlocks.
Make sure your transaction timeout is smaller than the job executor lock time. Then you are on the safe side.

tosc...@googlemail.com

Jun 20, 2014, 2:18:05 AM6/20/14
to camunda-...@googlegroups.com, tosc...@googlemail.com
Thanks for all your detailed responses. The behaviour of the job engine is now comprehensible.

Regards
Tobias
