Proposal for a Stored Procedures API (1.0 M9 preparation)

mpa...@pivotal.io

unread,

Mar 6, 2019, 9:14:22 AM3/6/19

to r2dbc

Hi all,

We've been stuck on the stored procedures request [0] for quite a while.

This proposal explains a possible approach so we can prepare a pull request

for R2DBC SPI 1.0 M9.

We'd like to get feedback on this issue from you.

The idea of this proposal is to enable (low-level) consumption of response

items that a database server sends in return to executing a stored

procedure (callable function?) beyond rows, a single row count,

and a single error.

Stored procedures allow returning multiple result sets, errors, and

out-parameters (feel free to extend this list if it is not complete).

I'm proposing to expose low-level response segments for consumption

for a versatile and extensible mechanism to consume query results beyond

aggregated row/update count consumption.

We would extend the Result type to accept a mapping function for response

segments. As of now, we've identified the following possible segments:

* OutParameter (Out-parameter descriptor)

* Row (Exists already)

* Notice (Info/Error notification)

* RowCount (Encapsulates the affected rows count)

A client can consume these segments as these items are read from the stream.

On the Statement side, we need a mechanism to register out parameters for certain

databases. Some databases do not require upfront parameter registration, some do,

and in some cases, we require a specific type declaration to call the right function

if stored procedures are overloaded.

I'm proposing to use Statment.bind() with a specific Parameter wrapper type

to not introduce additional methods.

See the following code for how the code would look like that uses the proposed changes:

Connection connection = …;

Publisher<String> functionResult = connection.createStatement("CALL my_function($1, $2, $3, $4)")

.bind(0, Parameter.out(String.class))

.bind(1, Parameter.out(String.class, "VARCHAR(400)"))

.bind(2, Parameter.inOut("the-value"))

.bind(3, Parameter.inOut(String.class, "NVARCHAR(8000)"))

.execute()

.flatMap(result -> result. map(OutParameter.class::isInstance, response -> {

OutParameter param = (OutParameter) response;

return param.get(String.class).

}));

I prepared a gist that contains all proposed types and changes at [1].

This proposal is a draft to illustrate the approach. It does not only

solve consumption of out parameters but also allows a more

fine-grained consumption of items returned by a Statement execution.

All naming is in Flux(!) and subject to suggestions.

The main questions to proceed from here are:

* Did we consider all possible outcomes of a stored procedure call?

* Is Result.map(Predicate, Function) or Result.flatMap(Function) a good way

to consume results of a stored procedure execution?

* How else could we consume responses without returning null and

reduced overhead to not always return empty publishers?

* Did we consider all cases in out/inout parameter registration?

* What is your general feedback on the API?

Special thanks go to Lukas Eder who raised the topic and explained a

lot of quirks of stored procedures to us.

I would also ask for feedback on this proposal of from JDBC driver maintainers

in particular.

Cheers,

Mark

[0] https://groups.google.com/forum/#!topic/r2dbc/Y-aFEQ3G4Eg

[1] https://gist.github.com/mp911de/84076bcbacbfa8f6a8aecaa73be70993

Lukas Eder

unread,

Mar 7, 2019, 3:58:02 AM3/7/19

to r2dbc

Hey there,

That was quick :)

Some further feedback from me.

String-typed "declaration" of OUT parameter types

https://gist.github.com/mp911de/84076bcbacbfa8f6a8aecaa73be70993#file-parameter-java-L43

As discussed on the phone, there are cases where the type of an IN / OUT parameter cannot be decided by the driver with 100% certainty, especially when nulls are involved (which are type less in Java, but not in SQL) and functions / procedures are overloaded. In this case, in order to disambiguate e.g. between VARCHAR and TEXT/CLOB (which both map to String.class in Java but may be different things in SQL), a parameter type reference is needed.

However, I would strongly suggest against using Strings here. With Strings, the possibilities are endless, and the ambiguities that may arise from this are endless, too. For example, will VARCHAR(500) work on all RDBMS? Or do some drivers not accept the length? Will some drivers be able to interpret this as VARCHAR2 (e.g. Oracle)?

Ultimately, I don't think you will get around an enumeration similar to java.sql.Types or maybe better java.sql.SQLType and java.sql.JDBCType.

Notice, this affects also other API that is not strictly related to stored procedures. For example, it must be possible to specify the parameter type when setting null on a prepared statement:

https://github.com/r2dbc/r2dbc-spi/blob/v1.0.0.M7/r2dbc-spi/src/main/java/io/r2dbc/spi/Statement.java#L159

IN OUT flag

This is interesting:

https://gist.github.com/mp911de/84076bcbacbfa8f6a8aecaa73be70993#file-parameter-java-L126

There are 3 states for a parameter: IN, IN OUT, OUT. In my opinion, you should not model these states with an inOut boolean and claim that the parameter is an OUT parameter, when it is not an IN OUT parameter.

I get that you're thinking of supporting this Parameter API only for IN OUT and OUT parameters, not for IN parameters. But why be clever about this? Wouldn't it be nicer and more consistent and regular if all three parameter modes would be offered through the same API?

Naming of OutParameter:

I suggest not to name this type "Parameter", because a parameter is the formal parameter specification. An argument would be the value that is passed to a parameter. In this case, maybe OutValue might do? There are other possible names, but I find Parameter confusing, specifically because there is already Parameter.out(), which is a "Parameter", not an "OutParameter"

https://gist.github.com/mp911de/84076bcbacbfa8f6a8aecaa73be70993#file-responses-java-L19

Naming of Result vs Response:

This is definitely confusing at the moment. Response is an individual element in the Result stream, but the names don't clearly indicate that. I don't have a better suggestion at the moment.

Javadoc on OutParameter:

I'm not aware of triggers being able to produce out parameter values

https://gist.github.com/mp911de/84076bcbacbfa8f6a8aecaa73be70993#file-responses-java-L14

Cursors on OutParameter:

One common use case in e.g. Oracle, maybe a bit less so in PostgreSQL, is to return various cursors through out parameters. How would this be done using this API here:

https://gist.github.com/mp911de/84076bcbacbfa8f6a8aecaa73be70993#file-responses-java-L41

You have Row extend Response, but I think that's insufficient as a cursor is several rows.

Also, don't forget that cursors can also appear outside of out parameters.

Why Predicate on Result.map():

In the presence of flatMap(), I don't see the point of the Predicate on the Result.map() method:

https://gist.github.com/mp911de/84076bcbacbfa8f6a8aecaa73be70993#file-result-java-L14

I'd rather there was a filter(Predicate<? super Response>):Result method and then <T> map(Function<? super Response, ? extends T>):Publisher<T> and the existing flatMap() method. Other "Stream" methods would be thinkable as well, including findFirst() or findAny() or skip(long) or limit(long), etc.

Notice, the Predicate and Function arguments should be contravariant, not covariant.

Another notice, the naming map() and flatMap() make it look as though Result is a monad (if you remove the Predicate from map()). But it isn't. Both methods return Publisher, not Result again. Now, Result could be made a monad if Result extends Publisher<Response>. Not sure if that's a good idea, though. Perhaps, alternative names would be better? I don't know.

Anyway, with the filter(Predicate) instead of map(Predicate, Function) suggestion, your example code at the bottom would be changed to:

Publisher<String> functionResult = connection.createStatement("CALL my_function($1, $2, $3)")

.bind(0, Parameter.out(String.class))

.bind(1, Parameter.inOut("foo", "VARCHAR(400)"))

.bind(2, Parameter.out(String.class, "NVARCHAR(8000)"))

.execute()

.flatMap(result -> result.filter(OutParameter.class::isInstance).map(response -> {

OutParameter param = (OutParameter) response;

return param.get(String.class);

}));

I think that's cleaner and more composable.

Exceptions:

With the new design, I take that database exceptions will be passed to Result as Notice instances:

https://gist.github.com/mp911de/84076bcbacbfa8f6a8aecaa73be70993#file-responses-java-L67

This means that drivers will not actually throw exceptions. This is much nicer for the edge case of calling procedural logic (especially in T-SQL) that generates several such exceptions as I had presented in this message:

https://groups.google.com/d/msg/r2dbc/QZpTpQtj1HA/VT0ob00WCQAJ

However, the normal case for an exception is:

1. Syntax error

2. Semantic exception, such as a constraint violation exception

3. Other similar cases.

In these cases, the will be the *only* result. Of course, the API would be more consistent and regular if this would still produce a "stream" of Result containing a single Notice and the call site can decide themselves how to proceed with that Notice.

Sergei Egorov

unread,

Mar 7, 2019, 4:21:05 PM3/7/19

to Lukas Eder, r2dbc

Hi Lukas,

Just FYI:

.flatMap(result -> result.filter(OutParameter.class::isInstance).map(response -> {

OutParameter param = (OutParameter) response;

return param.get(String.class);

}));

can be simplified to to:

.flatMap(result -> result.ofType(OutParameter.class).map(param -> {

return param.get(String.class);

}));

Thanks to the {Flux,Mono}#ofType operator :)

Best Regards,

Sergei

--
You received this message because you are subscribed to the Google Groups "r2dbc" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r2dbc+un...@googlegroups.com.
To post to this group, send email to r2...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/r2dbc/ab72d1ea-5629-4a40-ade0-ec9ea258ebb1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

mpa...@pivotal.io

unread,

Mar 8, 2019, 2:24:01 AM3/8/19

to r2dbc

Thanks a lot Lukas, for your prompt and elaborate feedback.

I've got some questions and remarks to your feedback.

String-typed "declaration" of OUT parameter types

I wasn't sure whether stored procedure overloading makes a distinction between VARCHAR(10) and VARCHAR(200) or whether it's just up to the base type.

I like the idea of a type enum and an interface for vendor-specific extensions.

IN/INOUT/OUT Parameters

That's a decent proposal to extend parameter wrapping also to IN parameters and prepared statement arguments.

Naming

Definitively a challenge to come up with good names.

Javadoc

We will fix that in the pull request. I guess we will also have to add further constraints and implementation notes.

Cursors as procedure results

My thinking here was to either provide a Cursor object, similar to Result (discarded that idea after our call). Alternatively, we can emit multiple Result objects (top-level) that encapsulate result consumption. In any case, I'm not exactly sure about this. From an API evolution perspective, I'd consider this requirement as something that can be added at a later time without breaking the API.

map vs. flatMap

I'm bothered by the nature of when data in flatMap is processed. Map applies a mapping function immediately, without lazy evaluation. flatMap embraces the deferred nature of Publisher. The reason I'm bringing this up is that a Row (and essentially every data structure holding data) is typically associated with resources (e.g., pooled buffers). Modeling the mapping as map(…) allows drivers to apply a mapping function and then release the Row/OutValue after the call. With flatMap, we wrap/keep the reference to the Row/OutValue beyond completion of the flatMap call, and we leave the object in the control of the caller.

Giving control to the caller imposes the requirement of resource disposal on the caller, and that's something we want to avoid at all costs. Otherwise, this functionality becomes a guarantee for resource leaks.

Exceptions

Very good point, we didn't consider the interaction of general exceptions with stored procedure result consumption. As you mentioned, we need to define a basic set of exception (transport, protocol, permission, integrity, and grammar) that are emitted as error signal regardless of notice consumption.

I really like the direction and how this all tightens up the API. Thanks a lot.

Happy to hear more opinions/feedback.

Cheers,

Mark

Lukas Eder

unread,

Mar 8, 2019, 4:16:01 PM3/8/19

to r2dbc

Comments inline

On Friday, March 8, 2019 at 8:24:01 AM UTC+1, mpa...@pivotal.io wrote:

Cursors as procedure results

My thinking here was to either provide a Cursor object, similar to Result (discarded that idea after our call). Alternatively, we can emit multiple Result objects (top-level) that encapsulate result consumption. In any case, I'm not exactly sure about this. From an API evolution perspective, I'd consider this requirement as something that can be added at a later time without breaking the API.

I don't know. I mean, currently, you're thinking of Row as a subtype of this new Response type. With that in place, it will be confusing / difficult to retrofit another Response subtype to do the right thing here. I personally don't think the scope of what's possible with JDBC is so large that this needs to be postponed. I mean, you also have to think about how LOBs are returned from stored procedures, or maybe large arrays. So, returning cursors (both as out parameters as well as secondary result sets) is not really something I'd postpone in the design of stored procedure support.

Your call, of course.

map vs. flatMap

I'm bothered by the nature of when data in flatMap is processed. Map applies a mapping function immediately, without lazy evaluation. flatMap embraces the deferred nature of Publisher. The reason I'm bringing this up is that a Row (and essentially every data structure holding data) is typically associated with resources (e.g., pooled buffers). Modeling the mapping as map(…) allows drivers to apply a mapping function and then release the Row/OutValue after the call. With flatMap, we wrap/keep the reference to the Row/OutValue beyond completion of the flatMap call, and we leave the object in the control of the caller.

Giving control to the caller imposes the requirement of resource disposal on the caller, and that's something we want to avoid at all costs. Otherwise, this functionality becomes a guarantee for resource leaks.

But this doesn't contradict my filter() method suggestion and, if you will, Sergei's ofType() suggestion?

Do note that I'm mainly criticising the recycling of well established names for monadic methods in a non-monadic way. map(Predicate, Function) just doesn't "feel right".

Reply all

Reply to author

Forward