JBatch Broken in Payara 5?


Reza Rahman

Feb 7, 2021, 9:02:01 PM
to Payara Forum
Hi,

I am in the process of upgrading Cargo Tracker to Jakarta EE 8/Payara 5.
Cargo Tracker included a simple batch case that worked perfectly fine in
Java EE 7/Payara 4.

When I try the upgrade, the batch process no longer works and I get this
log message:

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

2021-02-07T20:54:00.037-0500|WARNING: Caught throwable in main execution loop with Throwable message: null, and stack trace:
java.lang.NullPointerException
    at com.ibm.jbatch.container.impl.BaseStepControllerImpl.updateBatchStatus(BaseStepControllerImpl.java:287)
    at com.ibm.jbatch.container.impl.BaseStepControllerImpl.markStepFailed(BaseStepControllerImpl.java:243)
    at com.ibm.jbatch.container.impl.BaseStepControllerImpl.markJobAndStepFailed(BaseStepControllerImpl.java:248)
    at com.ibm.jbatch.container.impl.BaseStepControllerImpl.execute(BaseStepControllerImpl.java:133)
    at com.ibm.jbatch.container.impl.ExecutionTransitioner.doExecutionLoop(ExecutionTransitioner.java:112)
    at com.ibm.jbatch.container.impl.JobThreadRootControllerImpl.originateExecutionOnThread(JobThreadRootControllerImpl.java:110)
    at com.ibm.jbatch.container.util.BatchWorkUnit.run(BatchWorkUnit.java:80)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.glassfish.enterprise.concurrent.internal.ManagedFutureTask.run(ManagedFutureTask.java:143)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    at org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Is this a known issue? Is there a workaround?

The code is already checked in: https://github.com/eclipse-ee4j/cargotracker

Thanks in advance.

Reza Rahman
Jakarta EE Ambassador, Author, Blogger, Speaker

Please note views expressed here are my own as an individual community
member and do not reflect the views of my employer.

Ondro Mihályi

Feb 8, 2021, 5:20:27 PM
to Reza Rahman, Payara Forum
Hi Reza,

This might be similar to the issue reported in https://github.com/payara/Payara/issues/3776. We couldn't reproduce that problem, so we closed the GitHub issue. If you have a better reproducer with the CargoTracker app on the latest Payara Server, you can reopen the issue or create a new GitHub issue. Please describe what steps to take with CargoTracker to reproduce it.

All the best,
Ondro

On Mon, Feb 8, 2021 at 3:02 AM, Reza Rahman <m.reza...@gmail.com> wrote:
--
You received this message because you are subscribed to the Google Groups "Payara Forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to payara-forum...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/payara-forum/db927628-3e9b-d2b9-82bb-ffc5348b4aae%40gmail.com.

steve.millidge

Feb 9, 2021, 4:25:05 AM
to Payara Forum

This looks like it could be a race condition whereby the step tries to commit before the parent record has committed. As Ondro mentioned, instructions to reproduce would be good.

m.reza.rahman

Feb 9, 2021, 11:58:43 PM
to Payara Forum
Hi Steve/Ondro,

To be honest, reproducing it is super simple. All you really need to do is run the contents of the master branch on the latest Payara release and Java SE 8. I have tested it on both Windows 10 and macOS Big Sur. The batch process runs every few minutes, and each time it runs you will see log output. The execution succeeds only the very first time; all subsequent runs fail with what appears to be a lock acquisition error.

Please let me know if this is not specific enough; I will try to be clearer. If you are able to reproduce the problem, do let me know if there is a workaround. For example, using an older release is fine for now. It's even better if this is actually the result of a bug in the Cargo Tracker code (e.g. maybe we are failing to close some resources correctly).

Appreciate the help in advance!

Reza Rahman
Jakarta EE Ambassador, Author, Blogger, Speaker

Please note views expressed here are my own as an individual community member and do not reflect the views of my employer.

Sent via the Samsung Galaxy S7, an AT&T 4G LTE smartphone

Ondro Mihályi

Feb 14, 2021, 6:49:00 AM
to m.reza.rahman, Payara Forum
Hi Reza,

I've reproduced it too, even on Linux. The exception you see in the log is not the root exception; it hides the original exception (it's thrown from the catch block handling the original exception). The original exception is an FK constraint violation when inserting job status data into the internal H2 DB. It's possible that this is caused by the switch to H2 DB instead of Derby DB. You may be successful with a Payara Server version that still uses Derby DB. The last version of Payara Server with Derby DB is 5.194 (this commit removes the Derby DB)

Here's the original exception that I retrieved from the debugger:

com.ibm.jbatch.container.exception.PersistenceException: org.h2.jdbc.JdbcSQLException: Referential integrity constraint violation: "JOBEXEC_STEPEXEC_FK: PUBLIC.STEPEXECUTIONINSTANCEDATA FOREIGN KEY(JOBEXECID) REFERENCES PUBLIC.EXECUTIONINSTANCEDATA(JOBEXECID) (2)"; SQL statement:
INSERT INTO STEPEXECUTIONINSTANCEDATA (jobexecid, batchstatus, exitstatus, stepname, readcount,writecount, commitcount, rollbackcount, readskipcount, processskipcount, filtercount, writeskipcount, starttime,endtime, persistentdata) VALUES(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) [23506-196]
at fish.payara.jbatch.persistence.rdbms.JBatchJDBCPersistenceManager.createStepExecution(JBatchJDBCPersistenceManager.java:1688)
at fish.payara.jbatch.persistence.rdbms.JBatchJDBCPersistenceManager.createStepExecution(JBatchJDBCPersistenceManager.java:1631)
at fish.payara.jbatch.persistence.rdbms.LazyBootPersistenceManager.createStepExecution(LazyBootPersistenceManager.java:205)
at com.ibm.jbatch.container.impl.BaseStepControllerImpl.getNewStepExecution(BaseStepControllerImpl.java:416)
at com.ibm.jbatch.container.impl.BaseStepControllerImpl.shouldStepBeExecuted(BaseStepControllerImpl.java:303)
at com.ibm.jbatch.container.impl.BaseStepControllerImpl.execute(BaseStepControllerImpl.java:126)

at com.ibm.jbatch.container.impl.ExecutionTransitioner.doExecutionLoop(ExecutionTransitioner.java:112)
at com.ibm.jbatch.container.impl.JobThreadRootControllerImpl.originateExecutionOnThread(JobThreadRootControllerImpl.java:110)
at com.ibm.jbatch.container.util.BatchWorkUnit.run(BatchWorkUnit.java:80)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at org.glassfish.enterprise.concurrent.internal.ManagedFutureTask.run(ManagedFutureTask.java:143)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
at org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)
Caused by: org.h2.jdbc.JdbcSQLException: Referential integrity constraint violation: "JOBEXEC_STEPEXEC_FK: PUBLIC.STEPEXECUTIONINSTANCEDATA FOREIGN KEY(JOBEXECID) REFERENCES PUBLIC.EXECUTIONINSTANCEDATA(JOBEXECID) (2)"; SQL statement:
INSERT INTO STEPEXECUTIONINSTANCEDATA (jobexecid, batchstatus, exitstatus, stepname, readcount,writecount, commitcount, rollbackcount, readskipcount, processskipcount, filtercount, writeskipcount, starttime,endtime, persistentdata) VALUES(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) [23506-196]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:345)
at org.h2.message.DbException.get(DbException.java:179)
at org.h2.message.DbException.get(DbException.java:155)
at org.h2.constraint.ConstraintReferential.checkRowOwnTable(ConstraintReferential.java:371)
at org.h2.constraint.ConstraintReferential.checkRow(ConstraintReferential.java:313)
at org.h2.table.Table.fireConstraints(Table.java:980)
at org.h2.table.Table.fireAfterRow(Table.java:998)
at org.h2.command.dml.Insert.insertRows(Insert.java:161)
at org.h2.command.dml.Insert.update(Insert.java:114)
at org.h2.command.CommandContainer.update(CommandContainer.java:101)
at org.h2.command.Command.executeUpdate(Command.java:260)
at org.h2.jdbc.JdbcPreparedStatement.executeUpdateInternal(JdbcPreparedStatement.java:164)
at org.h2.jdbc.JdbcPreparedStatement.executeUpdate(JdbcPreparedStatement.java:150)
at com.sun.gjc.spi.base.PreparedStatementWrapper.executeUpdate(PreparedStatementWrapper.java:127)
at jdk.internal.reflect.GeneratedMethodAccessor94.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at com.sun.gjc.spi.jdbc40.ProfiledConnectionWrapper40$1.invoke(ProfiledConnectionWrapper40.java:457)
at com.sun.proxy.$Proxy378.executeUpdate(Unknown Source)
at fish.payara.jbatch.persistence.rdbms.JBatchJDBCPersistenceManager.createStepExecution(JBatchJDBCPersistenceManager.java:1677)
... 15 more

On Wed, Feb 10, 2021 at 5:58 AM, m.reza.rahman <m.reza...@gmail.com> wrote:

m.reza.rahman

Feb 14, 2021, 12:11:31 PM
to Ondro Mihályi, Payara Forum
OK, great. So I assume you will register this as an issue to fix? For now, Cargo Tracker could move forward with 5.194 and register this as a known issue.

David Sharpe

Feb 14, 2021, 12:14:35 PM
to m.reza.rahman, Ondro Mihályi, Payara Forum
Just to be clear, I think Reza is saying that this failure represents a failure to implement the JBatch spec properly? I'm not saying that's true, just trying to resolve the ambiguity.

Regards,
David 

Ondro Mihályi

Feb 17, 2021, 4:06:47 AM
to David Sharpe, m.reza.rahman, Payara Forum
I found out what the problem is, and also a workaround.

The issue only happens when the batch job is started from within a transaction (e.g. from an EJB). This is the case in the CargoTracker project, which starts the job from a timer in a stateless EJB. The H2 database connector expects the connection to commit automatically (autocommit=true), which is the default. But inside a transaction, Payara Server sets autocommit to false so that everything is committed at the end of the transaction.

A workaround is either to start the job in the background, using a managed executor service, or to disable managed transactions for the stateless EJB with TransactionManagementType.BEAN.
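
The first workaround might look like the following minimal sketch, assuming a timer EJB similar to the one in CargoTracker; the class name, schedule, and the job name "liveViewsJob" are illustrative placeholders, not the actual project code:

```java
import java.util.Properties;

import javax.annotation.Resource;
import javax.batch.runtime.BatchRuntime;
import javax.ejb.Schedule;
import javax.ejb.Stateless;
import javax.enterprise.concurrent.ManagedExecutorService;

@Stateless
public class BatchJobStarter {

    // Injects the container's default ManagedExecutorService
    @Resource
    private ManagedExecutorService executor;

    @Schedule(minute = "*/2", hour = "*", persistent = false)
    public void runJob() {
        // The submitted task runs on a separate thread outside the timer
        // method's container-managed transaction, so the batch runtime's
        // JDBC connections keep autocommit enabled as H2 expects.
        executor.submit(() ->
            BatchRuntime.getJobOperator().start("liveViewsJob", new Properties()));
    }
}
```

Because the work is handed off to a managed thread, the EJB's transaction commits independently of the job start.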

I'll raise an issue on GitHub where I'll explain the details.

All the best,
Ondro

On Sun, Feb 14, 2021 at 6:14 PM, David Sharpe <david.and...@gmail.com> wrote:

Steve Millidge (Payara)

Feb 17, 2021, 6:00:23 AM
to Ondro Mihályi, David Sharpe, m.reza.rahman, Payara Forum

Yeah, this will be a race condition: the outer transaction inserting the job ID is not committed before the inner transaction inserting the step record (which references the job ID) commits. This could always be the case, and it is not a bug per se when the step is on a different thread to the job.

Reza Rahman

Feb 17, 2021, 9:47:19 AM
to Payara Forum

OK. I will adjust Cargo Tracker for now. Is it worth checking what the Batch specification actually expects when a job is started within an existing managed transaction? I have to think that is a very common scenario.

steve.millidge

Feb 17, 2021, 10:14:01 AM
to Payara Forum
I have just checked, and it doesn't say much.

It says all transactions in JBatch are global transactions, and it says chunks are executed in their own transactions. Not much more than that. We have two transactions (the EJB's and the chunk's), so that bit is correct, and I think the transactions are racing.

To be honest, we still use the core JBatch RI implementation and haven't changed code in this area, so I think it is just the database engine change that is affecting the order in which transactions are committed. The JBatch RI JDBC persistence manager just uses getConnection, so it will enlist a database connection in any global TX that's hanging around, including the EJB's.

Steve 



Ondro Mihályi

Feb 17, 2021, 12:11:38 PM
to steve.millidge, Payara Forum
Steve is right, the Batch spec doesn't say much about transactions. It only says that the chunk checkpoints should run in a global transaction. This could be extended to the initial checkpoint before any chunk is processed, but that goes against the fact that JobOperator.start executes the job asynchronously in a different transaction, even before the global Tx commits.

As Steve says, we didn't change anything in the Java code except replacing Derby with H2.

I guess that to fix this even for managed transactions, the creation of the job should run on the same thread as the first step. The current implementation creates the job on the same thread, within the existing Tx, and then starts the first step in the background. It could instead run everything in the background, but that would require a change in the JBatch RI.

Ondro

On Wed, Feb 17, 2021 at 4:14 PM, steve.millidge <steve.m...@payara.fish> wrote:

m.reza.rahman

Feb 17, 2021, 12:23:34 PM
to Ondro Mihályi, steve.millidge, Payara Forum
It seems to me that the specification and the (former) RI need a bit of tightening up.

Sent via the Samsung Galaxy S7, an AT&T 4G LTE smartphone



steve.millidge

Feb 17, 2021, 12:36:30 PM
to Payara Forum
I suppose the question is (and this is not a criticism of the CargoTracker code):

Why does the developer want to put a global transaction around JobOperator.start(), and is the expectation that the job only starts at commit and is not started on a rollback? That isn't specified. I think in the CargoTracker case a global TX is started inadvertently, which is very common with EJBs.

Steve


Reza Rahman

Feb 17, 2021, 2:04:28 PM
to Payara Forum

The issue is that I really wasn't sure how Batch handled transactions, so I just put a transaction around the whole job just in case. Semantically, the expectation was that the entire batch run would be atomic, so that if it failed unexpectedly at the end, everything would roll back and avoid some kind of weird intermediate processing state.

That said, it is definitely worth revisiting the code. I don't remember exactly what the processing intent was, and there is file I/O involved too. Maybe atomicity does not matter at the job level in this case, so an umbrella transaction is really not appropriate.

All this aside, there is of course the question of whether the implicit EJB assumption that everything should be transactional by default is really correct.

I'll keep you posted here as to what I wind up doing with the code, even if temporarily.

Ondro Mihályi

Feb 17, 2021, 4:09:28 PM
to Reza Rahman, Payara Forum
I also found this in the Jakarta EE tutorial, which says something similar to the spec but in slightly different words:

Every chunk is processed in a global Jakarta EE transaction. If the processing of one item in the chunk fails, the transaction is rolled back and no processed items from this chunk are stored.

That suggests that a Batch job only runs individual chunks in a global transaction, which is used only for that chunk. It doesn't seem to mean that the job should attach to an existing transaction. Moreover, our issue is that job creation and step execution run in different transactions that commit in random order. The spec doesn't imply that job creation needs to run in a global transaction, or even in the existing transaction. So I think the behavior of the RI and Payara Server is an implementation detail.

In my opinion, the batch job should behave in the same way whether it's started from within a transaction or not. The job should be created in a separate transaction (nested or on a different thread) and then the first step should be executed asynchronously. I think that the RI just didn't expect that the job would be started in a managed transaction.

Another issue I have with Jakarta EE is that some functionality that isn't directly related to EJBs (like timers) is only supported in EJBs and thus by default starts a transaction. This is similar to what Steve said too. All such functionality should be extracted from the EJB spec so that it can be used e.g. from CDI. If it were possible to use @Schedule in any CDI bean, you wouldn't need to put it into a stateless EJB, and it wouldn't start a transaction. Until then, the best practice is to disable managed transactions for the stateless bean, because the job doesn't need to be started in a transaction.

Ondro

On Wed, Feb 17, 2021 at 8:04 PM, Reza Rahman <m.reza...@gmail.com> wrote:

Ondro Mihályi

Feb 17, 2021, 5:38:42 PM
to Reza Rahman, Payara Forum
To track this, I've raised a GitHub issue with a simple and reliable reproducer. However, I think this should be fixed in the RI.

As I wrote, the best solution for CargoTracker now is to use @TransactionManagement(TransactionManagementType.BEAN) on the stateless EJB that defines the timer method.
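
The recommended change might look like this minimal sketch; the class name, schedule, and the job name "liveViewsJob" are illustrative placeholders, not CargoTracker's actual code:

```java
import java.util.Properties;

import javax.batch.runtime.BatchRuntime;
import javax.ejb.Schedule;
import javax.ejb.Stateless;
import javax.ejb.TransactionManagement;
import javax.ejb.TransactionManagementType;

// Opting out of container-managed transactions means the timer method
// runs without a global transaction, so JobOperator.start() is not
// called inside one and the batch runtime's H2 connection keeps
// autocommit=true as it expects.
@Stateless
@TransactionManagement(TransactionManagementType.BEAN)
public class BatchJobTimer {

    @Schedule(minute = "*/2", hour = "*", persistent = false)
    public void startJob() {
        BatchRuntime.getJobOperator().start("liveViewsJob", new Properties());
    }
}
```

Since no UserTransaction is begun here, the method simply runs non-transactionally; any transactional work the job's steps need is still handled by the batch runtime itself.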

Ondro

On Wed, Feb 17, 2021 at 10:09 PM, Ondro Mihályi <ondrej....@gmail.com> wrote:

m.reza.rahman

Feb 22, 2021, 5:51:58 PM
to Payara Forum
FYI I have made the change for now and Cargo Tracker seems to work with all versions of Payara again.

I will revisit the code again soon and also do a bit more research. If I still feel something is amiss I will reopen a conversation here or start one on the Batch specification alias.

Thanks a lot for the prompt engagement, I really appreciate it.