Bind marker syntax

227 views
Skip to first unread message

Lukas Eder

unread,
Apr 8, 2019, 3:45:57 AM4/8/19
to r2dbc
Hello,

I'd like to follow up on the discussion in these tweets:

I must have misunderstood R2DBC's strategy here in the past, otherwise, I would have raised this issue before. To my (current) understanding, R2DBC does not plan on imposing the support of JDBC's bind marker syntax on R2DBC drivers. So, statements like

  SELECT * FROM t WHERE id = ?

will work only if a driver chooses to implement that syntax at its own discretion, but that's not a requirement. Instead, drivers are encouraged to enforce their vendor specific syntax instead, such as:

  -- Oracle and others
  SELECT * FROM t WHERE id = :1
  SELECT * FROM t WHERE id = :name

  -- SQL Server
  SELECT * FROM t WHERE id = @name

  -- PostgreSQL
  SELECT * FROM t WHERE id = $1

I think this is a mistake. Part of JDBC's success, in my opinion, is this standardisation of bind markers. It was always possible in JDBC to use vendor specific markers as the drivers could map indexes to names if they chose to support that. This made JDBC much easier to learn than if vendor specificity had to be documented everywhere. Note that driver implementors were always free to still support their database native syntax as well.

I understand Mark's feedback on Twitter, of course:

It does not make sense to force driver implementors to build an abstraction on top of native parameter placeholders. That’s a library feature, not a driver one.

But nothing prevents R2DBC from offering an out of the box implementation that maps ? to ?1, :1, $1, or whatever syntax a vendor supports, via an SPI. While this is a minor hassle for driver implementors (they've somehow managed for 22 years now), it's quite a hassle for API users. I know that a lot of people in the R2DBC team think that R2DBC will be mainly used by higher level abstractions and libraries, but I trust that end users will differ and also use R2DBC directly, just like JDBC has always been used directly.

Lukas

Gonzalo Ortiz Jaureguizar

unread,
Apr 8, 2019, 4:03:55 AM4/8/19
to r2dbc
+1

I understand that supporting a generic binding marker would make the driver implementations more complex, but without that the complexity is delegated on the libraries built on top of R2DBC, which will result in libraries that only support some databases, which is very bad for the ecosystem.

Dave Cramer

unread,
Apr 8, 2019, 6:39:43 AM4/8/19
to Gonzalo Ortiz Jaureguizar, r2dbc
On Mon, 8 Apr 2019 at 04:03, Gonzalo Ortiz Jaureguizar <golth...@gmail.com> wrote:
+1

I understand that supporting a generic binding marker would make the driver implementations more complex, but without that the complexity is delegated on the libraries built on top of R2DBC, which will result in libraries that only support some databases, which is very bad for the ecosystem.


I'd have to agree. That said please don't make it '?' This is already an overloaded operator in PostgreSQL
 
Dave Cramer
El lunes, 8 de abril de 2019, 9:45:57 (UTC+2), Lukas Eder escribió:
Hello,

I'd like to follow up on the discussion in these tweets:

I must have misunderstood R2DBC's strategy here in the past, otherwise, I would have raised this issue before. To my (current) understanding, R2DBC does not plan on imposing the support of JDBC's bind marker syntax on R2DBC drivers. So, statements like

  SELECT * FROM t WHERE id = ?

will work only if a driver chooses to implement that syntax at its own discretion, but that's not a requirement. Instead, drivers are encouraged to enforce their vendor specific syntax instead, such as:

  -- Oracle and others
  SELECT * FROM t WHERE id = :1
  SELECT * FROM t WHERE id = :name

  -- SQL Server
  SELECT * FROM t WHERE id = @name

  -- PostgreSQL
  SELECT * FROM t WHERE id = $1

I think this is a mistake. Part of JDBC's success, in my opinion, is this standardisation of bind markers. It was always possible in JDBC to use vendor specific markers as the drivers could map indexes to names if they chose to support that. This made JDBC much easier to learn than if vendor specificity had to be documented everywhere. Note that driver implementors were always free to still support their database native syntax as well.

I understand Mark's feedback on Twitter, of course:

It does not make sense to force driver implementors to build an abstraction on top of native parameter placeholders. That’s a library feature, not a driver one.

But nothing prevents R2DBC from offering an out of the box implementation that maps ? to ?1, :1, $1, or whatever syntax a vendor supports, via an SPI. While this is a minor hassle for driver implementors (they've somehow managed for 22 years now), it's quite a hassle for API users. I know that a lot of people in the R2DBC team think that R2DBC will be mainly used by higher level abstractions and libraries, but I trust that end users will differ and also use R2DBC directly, just like JDBC has always been used directly.

Lukas

--
You received this message because you are subscribed to the Google Groups "r2dbc" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r2dbc+un...@googlegroups.com.
To post to this group, send email to r2...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/r2dbc/c5c23989-1b10-46e7-85f6-277e0f7c2905%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Lukas Eder

unread,
Apr 8, 2019, 6:57:22 AM4/8/19
to Dave Cramer, Gonzalo Ortiz Jaureguizar, r2dbc
On Mon, Apr 8, 2019 at 12:39 PM Dave Cramer <davec...@gmail.com> wrote:
I'd have to agree. That said please don't make it '?' This is already an overloaded operator in PostgreSQL

That's quite an edge case in ordinary PostgreSQL usage, just like in Oracle and the SQL Standard, where '?' became a syntax element for MATCH_RECOGNIZE. Notice that Oracle and the ISO standards committee are working on defining a standard escape sequence for these cases, which will be useful given how significant '?' is in JDBC:

I think that deviating from '?' as the default is not necessarily wise, as there will most definitely some other conflict with some other syntax by some other vendor that will probably not be foreseen. The limitations of '?' have been understood by now. In PostgreSQL, for example, there's often a workaround for not being able to use '?' through JDBC, e.g.:

mpa...@pivotal.io

unread,
Apr 8, 2019, 10:25:18 AM4/8/19
to r2dbc
Thanks for continuing the discussion on our ML.

I see a few forces in this discussion and I'd like to hear more why you think we should (re)introduce standardized bind markers.

Here are my thoughts:

* Bind markers are a user-facing feature. We intentionally keep R2DBC SPI minimal and aim for encapsulation of just the database specifics in drivers. We do not want each driver vendor to put a user-specific abstraction into their drivers.
* You could argue the same way about R2DBC Connection URL. It's one of the most prominent user-facing features. We decided here to solve that requirement once within R2DBC so that drivers don't need to deal with URL parsing and come up with fancy connection URLs.
* There's a certain level of SQL portability. Probably most CRUD applications. When starting with pagination, it already starts getting difficult.
* We consider SQL portability as client library or even application-level concern. There are a ton of JDBC clients that address various areas of SQL database access: Connection/Transaction management, SQL query repository (e.g. lookup/construction of SQL queries), SQL query builders, CRUD tools, object (relational) mappers. Each library has its specifics. It either passes-thru SQL (portability/store-specifics are an application-level concern), build SQL (specifics are a library concern) or introduce their own query language (specifics are a library concern). 
From that perspective, R2DBC usage requires an extension of dialect-specific support in SQL builder-like libraries and object (relational) mappers. Users are shielded from this requirement because there's a library in front of them.
SQL query repositories and direct SQL usage require a bit effort on the application development side and potentially more for simple portable queries that are not dialect-specific. In any case, we're talking about new applications and we're not breaking code because that applications do not exist yet.
* I fear it does not matter which syntax we chose, there's likely a database that will have issues with it.
* As a comparison: Every other programming language uses native markers (or they don't use markers at all and materialize values in the actual query which opens up the path to potential vulnerabilities)

Cheers, 
Mark

Lukas Eder

unread,
Apr 8, 2019, 11:36:04 AM4/8/19
to r2dbc


On Monday, April 8, 2019 at 4:25:18 PM UTC+2, mpa...@pivotal.io wrote:
Thanks for continuing the discussion on our ML.

I see a few forces in this discussion and I'd like to hear more why you think we should (re)introduce standardized bind markers.

Here are my thoughts:

* Bind markers are a user-facing feature. We intentionally keep R2DBC SPI minimal and aim for encapsulation of just the database specifics in drivers. We do not want each driver vendor to put a user-specific abstraction into their drivers.

The encapsulation of the database specifics might as well include the translation of a standard parameter syntax to the database specifics. This is not a clear cut true/false discussion, more of a cost/benefit ratio discussion.
 
* You could argue the same way about R2DBC Connection URL. It's one of the most prominent user-facing features. We decided here to solve that requirement once within R2DBC so that drivers don't need to deal with URL parsing and come up with fancy connection URLs.

Yes, and in fact, I do argue the same way about it. Does R2DBC not support the exact same functionality that JDBC connection URLs support? An entire 22 year old tool chain profits from how things used to work. Are your goals really "just" bringing reactiveness to the Java/SQL interface, or do you want a revolution? :) Currently, we can just pass a well understood URL to some Spring configuration or what not, and "magically", we get a well managed connection pool that doesn't really need to understand everything about the backing RDBMS. Because the URLs are quite standardised. That's a great feature of JDBC, isn't it?

Of course, you can do whatever you want. But in terms of chances for adoption, I wouldn't reinvent too many wheels that are not strictly related to the "reactive" part of the API. That may just be my opinion, of course.
 
* There's a certain level of SQL portability. Probably most CRUD applications. When starting with pagination, it already starts getting difficult.

Indeed, JDBC's {limit} escape syntax is hardly known and probably doesn't work on most complex queries. However, the SQL standard has finally standardised the syntax in 2008 (11 years ago) through OFFSET .. FETCH, and it finds wider and wider adoption, including at least:

- DB2
- Derby
- H2
- Ingres (almost)
- Oracle
- PostgreSQL (now supporting both syntaxes)
- SQL Server

I'll expect more dialects to follow. Of course, at some point, you'll need something like jOOQ for full abstraction over all syntax.

But why not aim for this quick win nonetheless? Take CRUD, again. That might be 90% of all queries in ordinary OLTP systems. Standardising bind markers will help those 90%.

In hindsight, we can tell which JDBC features were good ideas (standardised URLs, bind parameter syntax), and which ones weren't (some of the other escape syntax)
 
* We consider SQL portability as client library or even application-level concern. There are a ton of JDBC clients that address various areas of SQL database access: Connection/Transaction management, SQL query repository (e.g. lookup/construction of SQL queries), SQL query builders, CRUD tools, object (relational) mappers. Each library has its specifics. It either passes-thru SQL (portability/store-specifics are an application-level concern), build SQL (specifics are a library concern) or introduce their own query language (specifics are a library concern). 
From that perspective, R2DBC usage requires an extension of dialect-specific support in SQL builder-like libraries and object (relational) mappers. Users are shielded from this requirement because there's a library in front of them.
SQL query repositories and direct SQL usage require a bit effort on the application development side and potentially more for simple portable queries that are not dialect-specific.

Sure, and as the vendor of jOOQ, I'm not really helping my own cause in trying to convince you to move this logic into the API :)

But in this particular case, instead of putting the "burden" of parsing the markers onto the driver implementors, you're putting it onto the clients, including e.g. tons of vendor agnostic SQL editors and tools like DBeaver, who will have to implement the same parameter generation logic again and again and again.

My suggestion here is that you could provide a default implementation that makes it very easy to translate between the standard JDBC syntax and the vendor specific one.
 
In any case, we're talking about new applications and we're not breaking code because that applications do not exist yet.

You're talking about that, and I'm talking about adoption. R2DBC's adoption will probably be faster if there are less hurdles for users.
 
* I fear it does not matter which syntax we chose, there's likely a database that will have issues with it.

Yes, hence the one that everyone already uses in JDBC has a big advantage.
 
* As a comparison: Every other programming language uses native markers (or they don't use markers at all and materialize values in the actual query which opens up the path to potential vulnerabilities)

Well, there's at least ODBC, which inspired most of JDBC :)

JDBC is the envy of many other programming languages, who often don't even have a standard API.

Notice, again, making standard markers the default doesn't mean users cannot use the vendor specific ones. In fact, I've used :name via JDBC in Oracle many times, myself...

Gonzalo Ortiz Jaureguizar

unread,
Apr 8, 2019, 11:47:11 AM4/8/19
to mpa...@pivotal.io, r2dbc
El lun., 8 abr. 2019 a las 16:25, <mpa...@pivotal.io> escribió:

* As a comparison: Every other programming language uses native markers (or they don't use markers at all and materialize values in the actual query which opens up the path to potential vulnerabilities)

I'm not an expert on other programming languages, but each time I try to access a database from non-JVM-languages I really miss the standardization that JDBC brings to the ecosystem.

Tons of decisions taken when Java was created seem deprecated today (including the blocking nature of JDBC!), but JDBC standardization (IMHO) was one of the best decisions they made.
 

Cheers, 
Mark

Am Montag, 8. April 2019 12:57:22 UTC+2 schrieb Lukas Eder:


On Mon, Apr 8, 2019 at 12:39 PM Dave Cramer <davec...@gmail.com> wrote:
I'd have to agree. That said please don't make it '?' This is already an overloaded operator in PostgreSQL

That's quite an edge case in ordinary PostgreSQL usage, just like in Oracle and the SQL Standard, where '?' became a syntax element for MATCH_RECOGNIZE. Notice that Oracle and the ISO standards committee are working on defining a standard escape sequence for these cases, which will be useful given how significant '?' is in JDBC:

I think that deviating from '?' as the default is not necessarily wise, as there will most definitely some other conflict with some other syntax by some other vendor that will probably not be foreseen. The limitations of '?' have been understood by now. In PostgreSQL, for example, there's often a workaround for not being able to use '?' through JDBC, e.g.:

--
You received this message because you are subscribed to the Google Groups "r2dbc" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r2dbc+un...@googlegroups.com.
To post to this group, send email to r2...@googlegroups.com.

John Blum

unread,
Apr 8, 2019, 12:03:03 PM4/8/19
to Gonzalo Ortiz Jaureguizar, Mark Paluch, r2dbc
+1 for not ever putting the burden on users in their applications.  

JDBC, like many other Java specifications (and SPIs) is about providing a consistent experience (LookN'Feel) across platforms.  Indeed, this is the very nature of the JVM itself.  Vendors, given the chance, are most likely going to do things in their best interest, including right up to vendor lock-in.  Most good, established frameworks (Spring, jOOQ, etc) and the APIs they provide are about giving users choice, shielding them from the nuances of the underlying technology.  I would not take that away from users ever and I cannot count how many times a good framework save my arsed in this regard when developing an application.

Even SQL (standards) itself provides commonality across DBMS vendors.

I think we would do well to adopt as many ideas/features from JDBC as possible, for consistency sake, which will only remove the barriers to adoption and the learning curve to Reactive.

$0.02
-John



For more options, visit https://groups.google.com/d/optout.


--
-John
john.blum10101 (skype)

Dave Cramer

unread,
Apr 8, 2019, 3:25:33 PM4/8/19
to Lukas Eder, Gonzalo Ortiz Jaureguizar, r2dbc
Lukas,

I'd like to respectfully disagree on the narrowness of the ? operator in PostgreSQL.

It is now being used by JSONB for string searching of keys https://www.postgresql.org/docs/current/functions-json.html

Dave Cramer

Lukas Eder

unread,
Apr 9, 2019, 2:54:13 AM4/9/19
to Dave Cramer, Gonzalo Ortiz Jaureguizar, r2dbc
Dave,

As I mentioned in a previous email (through this link: https://stackoverflow.com/a/38370973/521799), the operator is backed by a function in PostgreSQL, which could be used as a workaround (yes, doing so has its own caveats as indexes may not apply):

  jsonb_exists(jsonb, text)
  exist(hstore, text)

In any case, the suggestion made by Oracle to introduce a new escape syntax in the SQL language itself for such cases is promising.

Having said so, nothing would have prevented PostgreSQL's JDBC driver to introduce their own escape syntax for this edge case over the past years :)

Ben Hale

unread,
Apr 15, 2019, 5:10:23 PM4/15/19
to r2dbc, Dave Cramer, Gonzalo Ortiz Jaureguizar, Lukas Eder
To respond to some of the questions Lukas raised about the motivations around R2DBC, let me see if I can clarify a bit.

I do, to some extent, see R2DBC as an opportunity for a revolution. There are certain places (like three different kinds of Statements) where I think challenging the status quo is a good idea. But I'm not a fan of tossing out everything JDBC just because.

This marker syntax thing is quite interesting to me in this respect. I took this idea from one of ADBA's primary design goals, as the idea that there was no "portable enough" SQL that _could_ be shared with a generic bind marker. I'd always assumed that the ORM's wrote hyper-optimized SQL for each dialect and going a step further and asking them to dialect their bind markers wouldn't be too much to ask.

Now you, a true expert in the field, come back and point out that this isn't true and that there are large swathes of SQL that can be written "portably enough" to warrant a generic bind parameter. With that knowledge, I'm open to adding it to R2DBC. The question in my mind is how much of this is a hang-over from existing JDBC implementations (a bit of Stockholm Syndrome) versus what is an ideal situation a couple of years from now.

I think the best way to explore this today is trying to split the difference and adding a proxy implementation that applies a JDBC compatibility layer to the SQL going by. Exploring the solution this way gives us a couple of benefits as I see it. In one case it turns out that this compatibility behavior isn't necessary in the long term (as more clients join up) and its use becomes optional and decreasing. In the other case it turns out compatibility is *critical* for clients and we've managed to implement it in a really decoupled way so it can be included as part of the spec implementation (not the drivers) or used by the drivers as a library. Either outcome seems reasonable.


Separately, to my mind, the discussion about JDBC URL compatibility is much more fraught. I totally understand and deeply appreciate the benefits to understanding existing JDBC URIs. At the same time, I've got 15 years of experience and customer complaints about how inscrutable and inconsistently parsed those URIs are. I think there's very significant value to a better definition of database connection URIs and given that each R2DBC driver would need to mimic the same parsing behaviors and oddities but with new code, I'm not sure how to rate the value compatibility provides. It is worth thinking about some more though as slotting into the existing world is appealing.


-Ben
> --
> You received this message because you are subscribed to the Google Groups "r2dbc" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to r2dbc+un...@googlegroups.com.
> To post to this group, send email to r2...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/r2dbc/CAB4ELO7o8XsRYpYgCeU2Z-py8RE%3DCsuR1fc9RE71Gr%3DGgbdvvw%40mail.gmail.com.

Lukas Eder

unread,
Apr 16, 2019, 3:56:54 AM4/16/19
to Ben Hale, r2dbc, Dave Cramer, Gonzalo Ortiz Jaureguizar
Ben,

Thank you very much for your thoughtful words.

On Mon, Apr 15, 2019 at 11:10 PM Ben Hale <bh...@pivotal.io> wrote:
I do, to some extent, see R2DBC as an opportunity for a revolution.  There are certain places (like three different kinds of Statements) where I think challenging the status quo is a good idea.  But I'm not a fan of tossing out everything JDBC just because.

I absolutely agree. JDBC made many design mistakes, and the three kinds of Statements part is one of them.
 
This marker syntax thing is quite interesting to me in this respect.  I took this idea from one of ADBA's primary design goals, as the idea that there was no "portable enough" SQL that _could_ be shared with a generic bind marker.  I'd always assumed that the ORM's wrote hyper-optimized SQL for each dialect and going a step further and asking them to dialect their bind markers wouldn't be too much to ask.

Do note that I'm not writing these things merely from the perspective of an ORM vendor, which indeed, can easily generate dialect specific SQL. But there is a large community of people who do not want any ORM or even a SQL abstraction library like jOOQ to get in between their client code and the RDBMS. If you look at a number of third party surveys, you will see that the likes of JDBC, JdbcTemplate, JDBI, MyBatis, and even jOOQ in "plain SQL mode" as well as native queries in JPA/Hibernate are not too unpopular. There is always a part of someone's application where native SQL is the best choice. And I'm not referring to complex, vendor specific SQL, I'm referring to standard SQL-92 type SQL, which is just too much for an ORM to handle (e.g dozens of joins). For that part, generic bind markers are a really useful feature.

These people would want their SQL to still work on R2DBC just the same. Imagine, a library like JDBI, which is completely text based and does not interfere with user SQL, would be able to just add 2-3 methods that bind to R2DBC instead of JDBC. For users, this would be completely transparent.
 
Now you, a true expert in the field, come back and point out that this isn't true and that there are large swathes of SQL that can be written "portably enough" to warrant a generic bind parameter.  With that knowledge, I'm open to adding it to R2DBC.  The question in my mind is how much of this is a hang-over from existing JDBC implementations (a bit of Stockholm Syndrome) versus what is an ideal situation a couple of years from now.

I think the best way to explore this today is trying to split the difference and adding a proxy implementation that applies a JDBC compatibility layer to the SQL going by.  Exploring the solution this way gives us a couple of benefits as I see it.  In one case it turns out that this compatibility behavior isn't necessary in the long term (as more clients join up) and its use becomes optional and decreasing.  In the other case it turns out compatibility is *critical* for clients and we've managed to implement it in a really decoupled way so it can be included as part of the spec implementation (not the drivers) or used by the drivers as a library.  Either outcome seems reasonable.

I very much like that idea.

Separately, to my mind, the discussion about JDBC URL compatibility is much more fraught.  I totally understand and deeply appreciate the benefits to understanding existing JDBC URIs.  At the same time, I've got 15 years of experience and customer complaints about how inscrutable and inconsistently parsed those URIs are.  I think there's very significant value to a better definition of database connection URIs and given that each R2DBC driver would need to mimic the same parsing behaviors and oddities but with new code, I'm not sure how to rate the value compatibility provides.  It is worth thinking about some more though as slotting into the existing world is appealing.

Sure. I, for one, have much less strong opinions about those URLs than about bind markers :) 
Reply all
Reply to author
Forward
0 new messages