Recurring deadlock with Oracle DB in homogeneous cluster

marku...@serie-a.de

Mar 9, 2016, 2:02:31 AM
to camunda BPM users
Hi Guys,

I have a deadlock problem in my homogeneous cluster.

I have an application with an embedded process engine and the job executor activated.
This application is deployed in production on 4 Tomcats behind a load balancer. All nodes share the same database, so we have exactly one Oracle database.

So at runtime we have 4 active job executors.
One process cancels old objects based on a timer event. After the cancellation the process is finished.

There are about 1000 instances every night where the timer event occurs.

But I get incidents every night caused by ORA-00060 (deadlock detected while waiting for resource).

Here is the relevant part of my process engine configuration with Spring:

<bean id="processEngineConfiguration" class="org.camunda.bpm.engine.spring.SpringProcessEngineConfiguration">
  <property name="dataSource" ref="dataSource" />
  <property name="transactionManager" ref="transactionManager" />
  <property name="databaseSchemaUpdate" value="doNotCheck" />
  <property name="jobExecutorDeploymentAware" value="true" />
  <property name="jobExecutorActivate" value="true" />
  <property name="history" value="full" /> <!-- Full history logging -->
  <property name="defaultSerializationFormat" value="application/json" />
  <property name="createIncidentOnFailedJobEnabled" value="true" />
  ....
</bean>

I have also added the stack trace of the recurring ORA-00060 exception below. Maybe some constraints are missing for the Oracle DB?

Does anyone have an idea how this could occur? Would it help to deactivate the job executor on 3 of the 4 nodes?

Please let me know if I need to add more information.

Best regards,

Markus

org.camunda.bpm.engine.ProcessEngineException: ENGINE-03004 Exception while executing Database Operation 'DELETE_BULK deleteByteArrayNoRevisionCheck 2043217' with message '
### Error updating database. Cause: java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource

### The error may involve org.camunda.bpm.engine.impl.persistence.entity.VariableInstanceEntity.deleteByteArrayNoRevisionCheck-Inline
### The error occurred while setting parameters
### SQL: delete from ACT_GE_BYTEARRAY where ID_ = ?
### Cause: java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource
'. Flush summary:
[
INSERT HistoricVariableInstanceEntity[2041453]
INSERT HistoricVariableInstanceEntity[2041454]
INSERT HistoricVariableInstanceEntity[2041455]
INSERT HistoricVariableInstanceEntity[2041456]
INSERT HistoricVariableInstanceEntity[2041457]
INSERT HistoricVariableInstanceEntity[2041458]
INSERT HistoricVariableInstanceEntity[2041459]
INSERT HistoricVariableInstanceEntity[2041460]
INSERT HistoricVariableInstanceEntity[2041461]
INSERT HistoricVariableInstanceEntity[2041462]
INSERT HistoricJobLogEventEntity[2082253]
INSERT HistoricVariableUpdateEventEntity[2082240]
INSERT HistoricVariableUpdateEventEntity[2082241]
INSERT HistoricVariableUpdateEventEntity[2082242]
INSERT HistoricVariableUpdateEventEntity[2082243]
INSERT HistoricVariableUpdateEventEntity[2082244]
INSERT HistoricVariableUpdateEventEntity[2082245]
INSERT HistoricVariableUpdateEventEntity[2082246]
INSERT HistoricVariableUpdateEventEntity[2082247]
INSERT HistoricVariableUpdateEventEntity[2082248]
INSERT HistoricVariableUpdateEventEntity[2082249]
INSERT HistoricActivityInstanceEventEntity[EndEvent_0zan6z8:2082252]
INSERT HistoricActivityInstanceEventEntity[EndEvent_1e1wnb4:2082251]
INSERT HistoricActivityInstanceEventEntity[ExclusiveGateway_1qqayx4:2082250]
INSERT HistoricActivityInstanceEventEntity[startCancelFreightProcessEvent:2082239]
DELETE MessageEntity[2041463]
DELETE MessageEventSubscriptionEntity[1598189]
DELETE MessageEventSubscriptionEntity[1598190]
DELETE VariableInstanceEntity[1598191]
DELETE VariableInstanceEntity[1598192]
DELETE VariableInstanceEntity[1598193]
DELETE VariableInstanceEntity[1598194]
DELETE VariableInstanceEntity[1598195]
DELETE VariableInstanceEntity[1598196]
DELETE VariableInstanceEntity[1598725]
DELETE VariableInstanceEntity[1598727]
DELETE VariableInstanceEntity[1598730]
DELETE VariableInstanceEntity[2041426]
DELETE VariableInstanceEntity[2041453]
DELETE VariableInstanceEntity[2041454]
DELETE VariableInstanceEntity[2041455]
DELETE VariableInstanceEntity[2041456]
DELETE VariableInstanceEntity[2041457]
DELETE VariableInstanceEntity[2041458]
DELETE VariableInstanceEntity[2041459]
DELETE VariableInstanceEntity[2041460]
DELETE VariableInstanceEntity[2041461]
DELETE VariableInstanceEntity[2041462]
DELETE_BULK deleteByteArrayNoRevisionCheck 2043217
DELETE ExecutionEntity[2041452]
DELETE ExecutionEntity[2041424]
DELETE ExecutionEntity[1598188]
UPDATE HistoricActivityInstanceEventEntity[CallActivity_0m2l921:2041425]
UPDATE HistoricProcessInstanceEventEntity[1598188]
UPDATE HistoricProcessInstanceEventEntity[2041452]
]
at org.camunda.bpm.engine.impl.db.EnginePersistenceLogger.flushDbOperationException(EnginePersistenceLogger.java:113)
at org.camunda.bpm.engine.impl.db.entitymanager.DbEntityManager.flushDbOperationManager(DbEntityManager.java:296)
at org.camunda.bpm.engine.impl.db.entitymanager.DbEntityManager.flush(DbEntityManager.java:282)
at org.camunda.bpm.engine.impl.interceptor.CommandContext.flushSessions(CommandContext.java:315)
at org.camunda.bpm.engine.impl.interceptor.CommandContext.close(CommandContext.java:243)
at org.camunda.bpm.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:104)
at org.camunda.bpm.engine.spring.SpringTransactionInterceptor$1.doInTransaction(SpringTransactionInterceptor.java:42)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:133)
at org.camunda.bpm.engine.spring.SpringTransactionInterceptor.execute(SpringTransactionInterceptor.java:40)
at org.camunda.bpm.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:30)
at org.camunda.bpm.engine.impl.jobexecutor.ExecuteJobsRunnable.executeJob(ExecuteJobsRunnable.java:79)
at org.camunda.bpm.engine.impl.jobexecutor.ExecuteJobsRunnable.run(ExecuteJobsRunnable.java:66)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.ibatis.exceptions.PersistenceException:
### Error updating database. Cause: java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource

### The error may involve org.camunda.bpm.engine.impl.persistence.entity.VariableInstanceEntity.deleteByteArrayNoRevisionCheck-Inline
### The error occurred while setting parameters
### SQL: delete from ACT_GE_BYTEARRAY where ID_ = ?
### Cause: java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource

Christian Lipphardt

Mar 9, 2016, 2:19:34 AM
to camunda-...@googlegroups.com
Hi Markus,

Could you please tell us which version of Camunda you are using?

Cheers,
Christian

Markus Hens

Mar 9, 2016, 2:44:24 AM
to camunda BPM users
Hi Christian,

I am using Camunda 7.4.0 Final.

Best regards

Markus

marku...@serie-a.de

Mar 9, 2016, 6:28:12 AM
to camunda BPM users
Hi Guys,

I noticed that the problem also occurs in a non-clustered setup, i.e. with only one node but heavy load on the job executor.

Any ideas?

Best regards,

Markus

thorben....@camunda.com

Mar 9, 2016, 7:09:23 AM
to camunda BPM users, marku...@serie-a.de
Hi Markus,

Just a shot in the dark here. This could be the problem described in https://app.camunda.com/jira/browse/CAM-5440
You could try adding the index mentioned in the ticket description.

Cheers,
Thorben

marku...@serie-a.de

Mar 10, 2016, 1:54:43 AM
to camunda BPM users, marku...@serie-a.de
Hi Guys,

I was able to fix the problem. Because of a migration error, indexes were missing from the Camunda database schema in production that were present in the development environment.
Furthermore, we had not been using the StrongUuidGenerator as the ID generator, but we do now.

I'm not sure which of the two changes fixed the problem, but in combination everything works fine.

Thanks and best regards,

Markus
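
For readers who want to try the same ID generator change, it might look roughly like this in the Spring configuration. This is only a minimal sketch, assuming the same processEngineConfiguration bean as in the first post; the other properties are elided and have to match your own setup. The class org.camunda.bpm.engine.impl.persistence.StrongUuidGenerator ships with the engine, and as far as I know it relies on the java-uuid-generator library (com.fasterxml.uuid) being on the classpath.

```xml
<bean id="processEngineConfiguration" class="org.camunda.bpm.engine.spring.SpringProcessEngineConfiguration">
  <!-- other properties (dataSource, transactionManager, ...) unchanged from the existing configuration -->

  <!-- Generate entity IDs as UUIDs inside the JVM instead of using the
       default ID generator, which reserves ID blocks via the database -->
  <property name="idGenerator">
    <bean class="org.camunda.bpm.engine.impl.persistence.StrongUuidGenerator" />
  </property>
</bean>
```

The thread does not say which indexes were missing, so that part cannot be sketched here. Comparing the production schema against the create scripts shipped with the Camunda distribution for the version in use is probably the safest way to find them.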