Thread-safety: One DSLContext with one JDBC-Connection by multiple users.

47 views
Skip to first unread message

Thorsten Schöning

unread,
Jun 5, 2020, 5:44:57 AM6/5/20
to jOOQ User Group
Hi all,

I would just like to make sure that I'm not understanding things
wrongly, so here's my current setup:

Some backend exposes web services and has some layer of management
functions providing configs etc., including a pair of a DSLContext
using a JDBC-connection per request. Each request has its own pair of
DSLContext and JDBC-connection, though. Concepts like transactions are
bound to the scope of one entire request by default.

> return DSL.using
> (
> this.jdbc, SQLDialect.valueOf(dialect),
> new JooqSettingsLoader(this.config).load()
> );

Am I correct that this approach is at least safe to use with one
thread processing one request?

Am I correct that this approach is still safe when the request spawns
additional threads to query the database and all those threads use the
same DSLContext, same JDBC-connection etc.?

Transactions are NOT managed by those individual threads and
statements won't be reused as well. So it reads like from a JDBC point
of view, reusing the connection is safe:

https://docs.oracle.com/javadb/10.8.3.0/devguide/cdevconcepts89498.html

Resuing a DSLContext seems to be sage as well:

>> Can I use the singleton pattern to use it?
> Yes, that's the recommended approach

https://github.com/jOOQ/jOOQ/issues/7410#issuecomment-380454662

OTOH, there are some old discussions on the mailing list that make me
wonder if I'm wrong:

https://groups.google.com/forum/#!topic/jooq-user/VK7KQcjj3Co

The reason I'm somewhat sure to need to share one connection is that I
have queries in the backend providing results using temporary tables
within the scope of one request. This way different queries within the
same request, though basing data on each other, don't necessarily
re-execute the same queries over and over again. But that only works
with temp-tables using a shared connection. Multiple threads are
useful, because not all queries are based on each other if they are, I
implemented some locking to only create temp-tables ones on my own.

So is there something fundamentally wrong in my assumptions? Thanks!

Mit freundlichen Grüßen,

Thorsten Schöning

--
Thorsten Schöning E-Mail: Thorsten....@AM-SoFT.de
AM-SoFT IT-Systeme http://www.AM-SoFT.de/

Telefon...........05151- 9468- 55
Fax...............05151- 9468- 88
Mobil..............0178-8 9468- 04

AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Geschäftsführer: Andreas Muchow

Lukas Eder

unread,
Jun 8, 2020, 11:41:54 AM6/8/20
to jOOQ User Group
jOOQ did the-right-thing™ here (as always 😁).

JDBC Connections cannot be shared among threads.
JDBC DataSources can be, if they're implemented accordingly by connection pools.

jOOQ wraps the above. Stay within the thread-bound transaction model exposed by e.g. Spring or JavaEE, and then, there isn't any magic added by jOOQ.

On Fri, Jun 5, 2020 at 11:44 AM Thorsten Schöning <tscho...@am-soft.de> wrote:
Hi all,

I would just like to make sure that I'm not understanding things
wrongly, so here's my current setup:

Some backend exposes web services and has some layer of management
functions providing configs etc., including a pair of a DSLContext
using a JDBC-connection per request. Each request has its own pair of
DSLContext and JDBC-connection, though. Concepts like transactions are
bound to the scope of one entire request by default.

That's great.
 
> return DSL.using
> (
>       this.jdbc, SQLDialect.valueOf(dialect),
>       new JooqSettingsLoader(this.config).load()
> );

Am I correct that this approach is at least safe to use with one
thread processing one request?

I don't know. What are these things?
 
Am I correct that this approach is still safe when the request spawns
additional threads to query the database and all those threads use the
same DSLContext, same JDBC-connection etc.?

What's "this approach"?

... I mean, don't share the JDBC connection among threads, unless you're pooling the connection. But let the connection pool handle this for you (e.g. HikariCP). Or Spring, etc.
 
Transactions are NOT managed by those individual threads and
statements won't be reused as well. So it reads like from a JDBC point
of view, reusing the connection is safe:

https://docs.oracle.com/javadb/10.8.3.0/devguide/cdevconcepts89498.html

Are you using Derby? I'm not sure about Derby, but yes, you can re-use a pooled connection *sequentially*, but no *in parallel*. In most cases, two threads should never share a JDBC Connection.
 
Resuing a DSLContext seems to be sage as well:

>> Can I use the singleton pattern to use it?
> Yes, that's the recommended approach

https://github.com/jOOQ/jOOQ/issues/7410#issuecomment-380454662

Yes, because the DSLContext just wraps the Configuration, and the Configuration just wraps your various listeners, settings, SQL dialect, and your JDBC DataSource. If it wraps a JDBC Connection, then it inherits the Connection's thread-unsafety. If it wraps a DataSource, then it inherits the DataSource's thread-safety.
 
The reason I'm somewhat sure to need to share one connection is that I
have queries in the backend providing results using temporary tables
within the scope of one request. This way different queries within the
same request, though basing data on each other, don't necessarily
re-execute the same queries over and over again. But that only works
with temp-tables using a shared connection. Multiple threads are
useful, because not all queries are based on each other if they are, I
implemented some locking to only create temp-tables ones on my own.

So is there something fundamentally wrong in my assumptions? Thanks!

Yes, it seems to me there's something fishy here. What you seem to want to do is some asynchronous service that caches data in some particular cases. I've seen this being solved using a reactive model where some Flux (from the reactor library) produced the cached / buffered data for other services. This was quite elegant, but perhaps overkill in your case. Maybe, a simple process / thread that can synchronize on buffered query results would be sufficient here in your case. In any case, don't share the Connection itself among threads, it is not designed for that.

Thorsten Schöning

unread,
Jun 8, 2020, 11:43:31 AM6/8/20
to jOOQ User Group
Guten Tag Thorsten Schöning,
am Freitag, 5. Juni 2020 um 11:44 schrieben Sie:

> Transactions are NOT managed by those individual threads and
> statements won't be reused as well. So it reads like from a JDBC point
> of view, reusing the connection is safe:

> https://docs.oracle.com/javadb/10.8.3.0/devguide/cdevconcepts89498.html

While I still have the opinion that reusing a connection is safe, this
use case doesn't provide concurrent execution of statements like I
originally assumed. Instead, the JDBC-driver most likely synchronizes
on the underlying connection, serializing concurrent exec this way.

https://stackoverflow.com/a/52798543/2055163
https://github.com/pgjdbc/pgjdbc/blob/ecbc48ff6ad293a2555465d4f7fb6a33a403af26/pgjdbc/src/main/java/org/postgresql/core/v3/QueryExecutorImpl.java#L276
https://www.postgresql.org/message-id/1102807465.20200608101409%40am-soft.de

Thorsten Schöning

unread,
Jun 8, 2020, 12:51:01 PM6/8/20
to jOOQ User Group
Guten Tag Lukas Eder,
am Montag, 8. Juni 2020 um 17:41 schrieben Sie:

> JDBC Connections cannot be shared among threads.[...]

Is that more like a suggestion based on your experience or a hard
technical fact?

I didn't find hard facts yet, only advises and arguments that I do
understand in a away that sharing is at least partly supported.

The docs I provided read to me very JDBC-generic, not Derby specific.
Others use shared connections as well, it's only slower than
originally assumed because of synchronization.

https://stackoverflow.com/a/52798543/2055163

I'm sharing connections right now in my app and things work, with the
one caveat that performance is not as originally expected. So I don't
get really concurrent execution of independent statements using the
same connection. But things don't arbitrarily fail as well.

> Are you using Derby?

No, I had the feeling those docs are for JDBC in general.

> I'm not sure about Derby, but yes, you can re-use a
> pooled connection *sequentially*, but no *in parallel*. In most cases, two
> threads should never share a JDBC Connection.

There's the following sentence in the docs I provided:

> Committing or rolling back a transaction closes all open ResultSet
> objects and currently executing Statements, unless you are using
> held cursors.

https://docs.oracle.com/javadb/10.8.3.0/devguide/cdevconcepts89498.html

"ResultSet" and "Statement" is referred to in plural. Why is that not
implying parallel usage of the underlying connection? The JDBC-drivers
implement synchronization for a reason most likely as well. The linked
docs don't forbid concurrent usage as well and don't explicitly advise
to only share sequentially.

Thorsten Schöning

unread,
Jun 8, 2020, 1:09:30 PM6/8/20
to jOOQ User Group
Guten Tag Lukas Eder,
am Montag, 8. Juni 2020 um 17:41 schrieben Sie:

> Yes, it seems to me there's something fishy here. What you seem to want to
> do is some asynchronous service that caches data in some particular cases.

Things are easier than it might read: Consider ONE SOAP-request
containing the commands to render THREE different reports: rep1, rep2,
rep3. rep1 and rep2 are totally unrelated, rep3 is based on the data
of rep1.

There's currently no way to model dependencies and stuff between those
reports, I simply only know that some are based on each other's data
and that is hard-coded were necessary. So the implementation of rep3
executes data-retrieving of rep1 and does a bit more.

To not need to rexecute the query of rep1 entirely always, it simply
creates a temporary table and fetched from that. When executed because
of rep3, that way results are instantly available. But this only works
if one and the same connection is reused, because temporary tables are
session-private.

Because rep2 is totally unrelated to all other reports, it would be
best to execute that in parallel with processing of all other reports.
Because of the temp-table-approach I need to keep the same
JDBC-connection, because I can't know if rep2 is totally unrelated or
based on rep1 as well etc.

So I've researched a bit on that topic and came to the conclusion that
this should work. Now I know that because of synchronization in
JDBC-drivers it might not be as performant as expected, but that's
different from that it's not supposed to work at all. :-)

While you are not the only one telling so, others thought of the same
approaches like I had in mind and their tests succeeded as well.
Besides the performance aspect of course. So I wonder what is correct,
does/should it work or not.

I'm confused...

Lukas Eder

unread,
Jun 8, 2020, 1:36:39 PM6/8/20
to jOOQ User Group
On Mon, Jun 8, 2020 at 6:51 PM Thorsten Schöning <tscho...@am-soft.de> wrote:
Guten Tag Lukas Eder,
am Montag, 8. Juni 2020 um 17:41 schrieben Sie:

> JDBC Connections cannot be shared among threads.[...]

Is that more like a suggestion based on your experience or a hard
technical fact?

Spring and Java EE mostly implement thread bound transaction models, where you wouldn't get a Connection from another thread anyway.

I must admit, I'm not sure if there's a hard requirement from the JDBC *spec* for vanilla JDBC Connections not to be shared among threads. It's just something that I've personally never seen yet.

There's the following sentence in the docs I provided:

> Committing or rolling back a transaction closes all open ResultSet
> objects and currently executing Statements, unless you are using
> held cursors.

https://docs.oracle.com/javadb/10.8.3.0/devguide/cdevconcepts89498.html

"ResultSet" and "Statement" is referred to in plural. Why is that not
implying parallel usage of the underlying connection? The JDBC-drivers
implement synchronization for a reason most likely as well. The linked
docs don't forbid concurrent usage as well and don't explicitly advise
to only share sequentially.

Well, there's a difference between executing statements concurrently and consuming resources concurrently. Also, there's a difference between doing things "concurrently" *on the server side* and *on the client side*. Furthermore, what does "concurrent" even mean in terms of resource consumption? Is it really possible to consume two results sets at the same time, or are the two consumptions serialized in some way? Ultimately, JDBC is a network protocol abstraction. Each JDBC call corresponds to a message sent to the server, and a response retrieved back from the server. For example, ResultSet.next() will, if the next row isn't already cached in the client, send some sort of "fetch" message to the server, fetching the next batch of rows from some "cursor" resource, i.e. a buffer that is filled by some "query execution" resource.

So, indeed, there could be vendors that support some degree of true parallelism through some of these JDBC types. I don't actually know this. For most of my professional life, I've spared myself the trouble of delving into these things and followed purely thread-bound transaction models as they are much easier to reason about...

Lukas Eder

unread,
Jun 8, 2020, 1:39:32 PM6/8/20
to jOOQ User Group
But why use temporary tables at all? It seems like their session-private-ness is the source of your wanting to reuse the connection. If you were using real tables, your reports could just commit the data and thus make them available to other sessions...?

--
You received this message because you are subscribed to the Google Groups "jOOQ User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jooq-user+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jooq-user/1811714249.20200608190925%40am-soft.de.

Thorsten Schöning

unread,
Jun 9, 2020, 3:24:40 AM6/9/20
to jOOQ User Group
Guten Tag Lukas Eder,
am Montag, 8. Juni 2020 um 19:39 schrieben Sie:

> But why use temporary tables at all? It seems like their
> session-private-ness is the source of your wanting to reuse the connection.
> If you were using real tables, your reports could just commit the data and
> thus make them available to other sessions...?

That's what others have suggested as well already, but I see two major
problems in my concrete setup: Report-stuff is work in progress and
changes fast, but my schema is manually maintained and versioned using
custom update-SQL and stuff. So it introduces quite some overhead,
which can be avoided by using temp-tables in code only.

Additionally, the calculated data is really only valid per session and
its associated transaction anyway. New requests with new sessions
might come to different results, so I would need to think of how to
maintain those different results in non-session-local tables, clean
things up at some point etc. This is a bit easier with temp-tables.

The better compromise might be to cache the results per request in
memory instead of in temp-tables, saving roundtrips to fetch it again
this way as well. I was only using temp-tables in some queries anyway
already and had the idea to reuse them as session-wide cache instead
of using memory therefore.

The idea of concurrently using the same JDBC-connection came only
afterwards, when adding reports which don't query the database at all.
Those can be processed multi-threading with reports which do query the
database in theory, but I currently don't know which report accesses
the DB when scheduling those. So it would be great if concurrent
access of JDBC would just work and I don't need to care too much. :-)

I'll think about this whole stuff, so thanks about your suggestions!

Lukas Eder

unread,
Jun 9, 2020, 3:34:52 AM6/9/20
to jOOQ User Group
On Tue, Jun 9, 2020 at 9:24 AM Thorsten Schöning <tscho...@am-soft.de> wrote:
The idea of concurrently using the same JDBC-connection came only
afterwards, when adding reports which don't query the database at all.
Those can be processed multi-threading with reports which do query the
database in theory, but I currently don't know which report accesses
the DB when scheduling those. So it would be great if concurrent
access of JDBC would just work and I don't need to care too much. :-)

But my other suggestion stands. Why have concurrent access to the JDBC Connection when you can have a single process that manages this connection and that can receive messages (possibly async) to hide managing the connection for you?

With Oracle, I've done this many times in the past, using AQ as a means for inter-process communication, even across servers, not just across threads...

Thorsten Schöning

unread,
Jun 9, 2020, 6:15:49 AM6/9/20
to jOOQ User Group
Guten Tag Lukas Eder,
am Dienstag, 9. Juni 2020 um 09:34 schrieben Sie:

> But my other suggestion stands. Why have concurrent access to the JDBC
> Connection when you can have a single process that manages this connection
> and that can receive messages (possibly async) to hide managing the
> connection for you?

Things depend on how my app works currently and which I can't change
too easily:

SOAP-request -> 1 JDBC-connection -> app business logic [-> possibly
many threads] -> collect results -> 1 COMMIT/ROLLBACK -> SOAP-response

This way there's one transaction spanning the entire request managing
all changed data, whatever it is. There's no pool or else involved,
the app is pretty oold-school. Of course it could be added...

But even if there was a pool, I would still need to think of how to
process reports concurrently: Executing each in its own JDBC-conn.
means possibly re-executing the same query over and over again,
increasing the load on the system, when executing the query once per
SOAP-request would be enough. Or I would need to cache results in
memory somehow, which would require additional serialization on the
executed queries to execute the same query once only.

OTOH, multiple JDBC-connections allow concurrent exec of unrelated
reports, increasing per-request performance most likely. But what
happens if I at some point decide reports are not read-only anymore,
but keep some history as well? Already considered that.

My current transaction model commits/rollbacks per SOAP-request, all
or nothing, which doesn't hold with multiple connections. Instead, I
would need to keep them until shortly before the response to either
commit/rollback all of those or need to abandon the all-or-nothing I
have currently or ...

Currently I prefer all-or-nothing, makes life easier.
Reply all
Reply to author
Forward
0 new messages