Handling Transactions and Batch Execution with Dependency Injection

1,024 views
Skip to first unread message

Josh Padnick

unread,
Jul 17, 2014, 1:15:13 PM7/17/14
to jooq...@googlegroups.com

Hello.  We're using Dependency Injection (Google Guice) in our app and would like to define a method like this:

public myMethod() {
   TransactionModule.startTransaction();
module1.insertSomething();
module2.insertSomethingElse(); TransactionModule.endTransaction();
}

The idea is each module has its own jOOQ code that does a database insert.

As these are independent modules, it makes sense to separate them, but (1) I'd like to assert a transaction at one level above the modules since the modules themselves can't know if they're part of a transaction, and (2) I'd like to leverage JDBC Batch Statements to improve efficiency.

I posted about the JDBC Batch Statement part on the jOOQ documentation site, and Lukas (kindly) responded that I seemed to be suggesting a new feature.  Specifically, I proposed there that maybe jOOQ could allow setting some kind of flag to let it know I'm still queuing up statements, but it appears to only the create.batch() method is supported now.

It looks like JDBC Batch Statements are a big performance boost (http://java.dzone.com/articles/what-you-didnt-know-about-jdbc), so it would be nice to incorporate these, but how best to do this in a Dependency Injection paradigm?

So, here are my official questions to the most excellent jOOQ community:

JDBC Batch Statements

  1. What is the best way to combine Dependency Injection where individual modules execute different database operations with JDBC Batch Statements?
Transactions
  1. What is the best way to use transactions with Dependency Injection, again where individual modules execute JDBC operations without knowledge they may be participating in a transaction.

Thank you all for the input!

Lukas Eder

unread,
Jul 18, 2014, 5:34:11 AM7/18/14
to jooq...@googlegroups.com
So, here are my official questions to the most excellent jOOQ community:

Folks, that's us! Thanks for the flattering ;-)

Please, group, do participate in this very interesting discussion!

For the reference, this is the manual page and the Disqus discussion that Josh was referring to:

This is a very interesting topic, and I've personally evaluated doing similar things in a previous contracting engagement with Adobe, for Apache Jackrabbit. Jackrabbit was issuing tons of similar SQL statements (if backed by a SQL store) while traversing the JCR tree. So I figured that in some cases, I can simply collect all JDBC statements into sets of batches like so (Java 8 syntax for improved readability):

// "Regular code":
// ---------------
module1.insertSomething();
module2.insertSomethingElse();

// The above might generate... (each line is a statement)
// INSERT INTO something (A, B) VALUES (?, ?)
// INSERT INTO something (A, B) VALUES (?, ?)
// INSERT INTO something (A, B) VALUES (?, ?)
// INSERT INTO something (A, B, C) VALUES (?, ?, ?)
// INSERT INTO something (A, B, C) VALUES (?, ?, ?)
// INSERT INTO something (A, B) VALUES (?, ?)
// INSERT INTO something_else (X, Y) VALUES (?, ?)
// INSERT INTO something_else (X, Y) VALUES (?, ?)
// INSERT INTO something_else (X, Y) VALUES (?, ?)

// "Batch-collecting code":
BatchCollector.collectAndExecute(() -> {
    module1.insertSomething();
    module2.insertSomethingElse();
});

// The above might now generate... (each line is a batch statement)
// INSERT INTO something (A, B) VALUES (?, ?)       -- With 3 calls to PreparedStatement.addBatch()
// INSERT INTO something (A, B, C) VALUES (?, ?, ?) -- With 2 calls to PreparedStatement.addBatch()
// INSERT INTO something (A, B) VALUES (?, ?)       -- With 1 calls to PreparedStatement.addBatch() or perhaps this won't generate a batch
// INSERT INTO something_else (X, Y) VALUES (?, ?)  -- With 2 calls to PreparedStatement.addBatch()

After hacking about 2 hours, this worked very well, failing only around 1% of JackRabbit integration tests. The relevant code sections were speeded up considerably.

On a JDBC level, the rules were simple:

1. Each time a new SQL text is encountered, the previous batch (if any) is executed
2. Each time the same SQL text is encountered, the statement is not executed, but the bind values were collected
3. When the transaction ended, the last (unexecuted) batch is executed

Regarding your questions:

JDBC Batch Statements
  1. What is the best way to combine Dependency Injection where individual modules execute different database operations with JDBC Batch Statements?
The modules would need to have a "standard" access to a Connection or DataSource in order to generate batch statements transparently. With jOOQ, this would be the DSL.using(configuration) idiom, from which all statements can be constructed / executed. Such a Connection or DataSource would act as a proxy to the real JDBC driver and encapsulate all collection logic in there.
 
Transactions
  1. What is the best way to use transactions with Dependency Injection, again where individual modules execute JDBC operations without knowledge they may be participating in a transaction.

With Spring, transactions are indeed handled transparently, so I suspect that transactions are not part of the game here. However, you must ensure that whenever you enter a "batch collecting code section", all your "modules" will access the proxied Connection and/or DataSource, instead of the "real" one. Anyone who can access the "real" DataSource will unexpectedly insert" statements before a batch, instead of triggering batch execution. Consider the following set of SQL statements being executed from your "modules":

// INSERT INTO something (A, B) VALUES (?, ?)       - Executed through proxy DS
// INSERT INTO something (A, B) VALUES (?, ?)       - Executed through proxy DS
// INSERT INTO something (A, B, C) VALUES (?, ?, ?) - Executed through original DS

As the last statement is executed through the original DS, it will be flushed to the database server immediately, resulting in this behaviour:

// INSERT INTO something (A, B, C) VALUES (?, ?, ?) - Executed directly
// INSERT INTO something (A, B) VALUES (?, ?)       - Executed as 2 batch statements

As I said before, this is a very interesting discussion. I have added a feature request for this kind of feature to the jOOQ roadmap:

In jOOQ, while within such a collectAndExecute() method, the jOOQ Configuration could change its internal ConnectionProvider easily to collect all statements into batches. This would obviously have impacts on the application design, as users will need to operate only on this modified jOOQ Configuration.

In the mean time, you could already experiment with the above on a JDBC level and make this completely transparent to jOOQ.

What do others think?

Cheers,
Lukas

Roger Thomas

unread,
Jul 19, 2014, 7:32:41 AM7/19/14
to jooq...@googlegroups.com
With the work you have done in JOOQ we now have a way to express the transaction as create.transaction (ctx -> { work to do within transaction } )

So something like create.batch (ctx -> { work to do within batch } ) would be a nice logical extension.

One thing you would have to provide a lot of information to the log regarding what statements are being added to a batch and which are being executed outside of a batch, as such detail would be needed to find issues such as you detail in your example regarding the reordering of inserts.

Roger

Lukas Eder

unread,
Jul 21, 2014, 5:59:44 AM7/21/14
to jooq...@googlegroups.com
Hi Roger,

2014-07-19 13:32 GMT+02:00 Roger Thomas <rithom...@gmail.com>:
With the work you have done in JOOQ we now have a way to express the transaction as create.transaction (ctx -> { work to do within transaction } )

So something like create.batch (ctx -> { work to do within batch } ) would be a nice logical extension.

True, I had actually thought of that comparison right after having sent the previous E-Mail. I'll fix the GitHub issue to reflect your suggested API notation.
So, are you already using this new jOOQ 3.4 interface with Java 8?
 
One thing you would have to provide a lot of information to the log regarding what statements are being added to a batch and which are being executed outside of a batch, as such detail would be needed to find issues such as you detail in your example regarding the reordering of inserts.

Yes, that's right. The current LoggerListener would need to be enhanced to reflect that.

Roger Thomas

unread,
Jul 21, 2014, 5:02:47 PM7/21/14
to jooq...@googlegroups.com


On Monday, July 21, 2014 10:59:44 AM UTC+1, Lukas Eder wrote:
Hi Roger,

So, are you already using this new jOOQ 3.4 interface with Java 8?
 
Its better to say that I'm playing with jOOQ and Java 8. The client of mine who is looking at jOOQ (via my recommendation) is more likely to start within a Java 7 world as other components of their design are currrently limited to Java 7, but that may change over the life of the project.

One thing, while it would be possible to nest batches within transactions can you explicitly make sure that a transaction can not be nested within a batch, unless you plan to correctly emulate the removal of entries within your batch based on the commiting or rollback of transactions that take place within the batch (does that make sense!).

Roger

Lukas Eder

unread,
Jul 22, 2014, 4:24:04 AM7/22/14
to jooq...@googlegroups.com
2014-07-21 23:02 GMT+02:00 Roger Thomas <rithom...@gmail.com>:


On Monday, July 21, 2014 10:59:44 AM UTC+1, Lukas Eder wrote:
Hi Roger,

So, are you already using this new jOOQ 3.4 interface with Java 8?
 
Its better to say that I'm playing with jOOQ and Java 8. The client of mine who is looking at jOOQ (via my recommendation) is more likely to start within a Java 7 world as other components of their design are currrently limited to Java 7, but that may change over the life of the project.

Fair enough. From a tooling perspective, I cannot recommend upgrading to Java 8 yet. Both Eclipse and IntelliJ still have quite a few bugs - which may be annoying in larger projects. We only use Java 8 for integration testing, where the issues are still bearable.
 
One thing, while it would be possible to nest batches within transactions can you explicitly make sure that a transaction can not be nested within a batch, unless you plan to correctly emulate the removal of entries within your batch based on the commiting or rollback of transactions that take place within the batch (does that make sense!).

I haven't done all the thinking through possible combinations yet, but this is certainly worth thinking about. Some example options:

Batch within batch

1) This could be disallowed
2) The nested batch could be simply ignored - contributing statements to the outer batch
3) The nested batch could generate a new, independent "batch wrapper" - inserted at the right spot in the outer batch

In both cases, exceptions thrown from the nested batch would "rollback" that nested batch to a "savepoint" that was placed at the beginning of the nested batch

Transaction within batch

a) This could be disallowed
b) This could emit a "transaction wrapper" to the batch collector, which will execute the (nested?) transaction semantics only once the outer batch is executed. Given that we're in "batch mode", this could then batch the transaction contents separately (consistent with option 3) above)

For the reference: Transaction within transaction

i) The concrete TransactionProvider can disallow it
ii) The DefaultTransactionProvider creates a Savepoint at the beginning of the nested transaction, rolling back to that Savepoint on exceptions
iii) Custom TransactionProviders could implement other propagation semantics, e.g. "autonomous transaction"


I think that once more, it is clear that the key to success will be an overridable SPI with a default implementation for all of the above, which makes sense when using plain JDBC Connections. My current preference for defaults is: 
3) + b) + ii)

Cheers,
Lukas
Reply all
Reply to author
Forward
0 new messages