Let's dream about DSL generation and documentation [Was: How to deal with Java's reserved keywords]


Christopher Deckers

Jun 23, 2013, 1:28:41 PM
to jooq...@googlegroups.com
Hi Lukas,

There are starting to be several discussions that could be brought together, revolving around the actual construction of the jOOQ DSL, its formalism, supported dialects and versions, etc.


Created a new issue for jOOQ 4.0, to keep in the back of our heads:

If the two APIs can be maintained in parallel, the change might also be introduced in a minor release in 3.x

"For backwards-compatibility reasons, it would be important to maintain the two APIs in parallel, potentially using a code generator to generate one API from the other"


With Christopher Deckers's recent help in creating a railroad-diagram tool [1] and Peter Verhas's inspiration, mapping a finite state machine to a fluent API (and / or a BNF), the language aspects of jOOQ will hopefully be heavily improved in the near future [3].


So, what would be the ideal way of building jOOQ so that it has all the features we dream of?

Suppose a user wants one of (these are just some of the possible examples):
- strict API for a particular dialect and version of one RDBMS.
- strict+simulated for a common subset of dialects.
- all strict+simulated for all dialects.
- naming strategy as all caps.
- naming strategy as camelCase.
- documentation with BNF syntax and railroad diagrams for their selected subset.

It would be interesting for the standard jOOQ to be one set of these possible choices, while letting users generate/download their own flavor that restricts/adjusts the jOOQ API.

Then, it means the DSL generator is the central piece. The DSL generator generates interfaces and methods (with Javadoc) that match the desired criteria, as well as a BNF notation, from which syntax can be studied eventually in the form of railroad diagrams.
The generator would somehow generate the implementation of these interfaces that call core classes that never change. The whole jOOQ internals never change, only the facade that is exposed to the user. A different set of criteria simply means a different facade. Code completion would be great and to the point!


For example, my very own set of criteria would be:
- all caps.
- union of strict APIs for SQLServer 2008+ and Oracle 10.

That way, the API would not simulate anything (strict SQL) and I would be able to call SQLServer or Oracle specific functions conditionally when I need to. Of course, I would have to be careful not to call Oracle-specific stuff when on SQLServer, but I would likely generate the strict API of each just to keep the corresponding documentation and railroad diagrams at hand.

Cheers,
-Christopher

Lukas Eder

Jun 25, 2013, 3:36:34 AM
to jooq...@googlegroups.com
Hi Christopher,
 
Thanks for a wonderful compilation of jOOQ's mid-term mission!
A couple of comments, inline:
 
2013/6/23 Christopher Deckers <chr...@gmail.com>

Hi Lukas,

There are starting to be several discussions that could be brought together, revolving around the actual construction of the jOOQ DSL, its formalism, supported dialects and versions, etc.


Created a new issue for jOOQ 4.0, to keep in the back of our heads:

If the two APIs can be maintained in parallel, the change might also be introduced in a minor release in 3.x

"For backwards-compatibility reasons, it would be important to maintain the two APIs in parallel, potentially using a code generator to generate one API from the other"


With Christopher Deckers's recent help in creating a railroad-diagram tool [1] and Peter Verhas's inspiration, mapping a finite state machine to a fluent API (and / or a BNF), the language aspects of jOOQ will hopefully be heavily improved in the near future [3].


So, what would be the ideal way of building jOOQ so that it has all the features we dream of?

Suppose a user wants one of (these are just some of the possible examples):
- strict API for a particular dialect and version of one RDBMS.
- strict+simulated for a common subset of dialects.
- all strict+simulated for all dialects.
- naming strategy as all caps.
- naming strategy as camelCase.
- documentation with BNF syntax and railroad diagrams for their selected subset.
 
Nice. So the API generation configuration dimensions are:
- Dialect support
- Strictness
- API naming strategies
 
Note, I'm thinking about distinguishing "strictness" between
- No simulation (strict)
- "simple" simulation: syntactic equivalence, such as NVL(a, b) = CASE WHEN a IS NOT NULL THEN a ELSE b END
- "transformative" simulation: transforming SQL to form very different SQL, e.g. LIMIT / OFFSET in Oracle, SQLServer 2008, LPAD() in SQLite (see other thread), or row value expressions and IN predicates:

It would be interesting for the standard jOOQ to be one set of these possible choices, while letting users generate/download their own flavor that restricts/adjusts the jOOQ API.

Yes. The standard jOOQ would certainly include:
- "transformative" simulation (no "strictness")
- All SQL dialects
- the camelCase() API

Its goal is to get new users started as quickly as possible. Maybe, there will also be a parallel distribution with the UPPER_CASE() API for all dialects. The goal is that only advanced users will actually have to / want to make use of custom API generation.
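As a minimal sketch of what the naming-strategy dimension could look like, here is one way a generator might derive an UPPER_CASE() method name from its camelCase() counterpart. The class NamingStrategySketch and its method are hypothetical and not part of jOOQ; the rule shown (underscore before each capital that does not follow another capital) is an assumption, not a decided convention.

```java
// Hypothetical helper, not part of jOOQ: derives an UPPER_CASE method name
// from a camelCase one, as a DSL generator's naming strategy might do.
final class NamingStrategySketch {

    static String toUpperCase(String camelCase) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < camelCase.length(); i++) {
            char c = camelCase.charAt(i);
            // Insert an underscore before an upper-case letter, unless it is
            // the first character or directly follows another upper-case letter.
            if (i > 0 && Character.isUpperCase(c)
                    && !Character.isUpperCase(camelCase.charAt(i - 1))) {
                sb.append('_');
            }
            sb.append(Character.toUpperCase(c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toUpperCase("isNull"));     // IS_NULL
        System.out.println(toUpperCase("selectFrom")); // SELECT_FROM
    }
}
```

A real generator would also need a policy for names that collide after conversion, which this sketch ignores.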

Then, it means the DSL generator is the central piece. The DSL generator generates interfaces and methods (with Javadoc) that match the desired criteria, as well as a BNF notation, from which syntax can be studied eventually in the form of railroad diagrams.

I'm even dreaming of putting RRDiagram SVGs into the generated Javadoc :-)
 
The generator would somehow generate the implementation of these interfaces that call core classes that never change. The whole jOOQ internals never change, only the facade that is exposed to the user. A different set of criteria simply means a different facade. Code completion would be great and to the point!

Good point. So this creates the bridge to the clean DSL vs. model API separation, where the model API is mutable for dynamic SQL construction, whereas the DSL API can easily become immutable, as all the tedious work of making the DSL implementation immutable could be generated.
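To make the idea of an immutable DSL layered over a mutable model concrete, here is a rough sketch. All names (ImmutableDslSketch, SelectModel, Select) are hypothetical and string-based for brevity; this is not how jOOQ is implemented, only an illustration of the copy-on-each-step boilerplate that a generator could produce.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not jOOQ code.
final class ImmutableDslSketch {

    // Mutable model: convenient for dynamic SQL construction.
    static final class SelectModel {
        final List<String> fields = new ArrayList<>();
        String table;

        SelectModel copy() {
            SelectModel m = new SelectModel();
            m.fields.addAll(fields);
            m.table = table;
            return m;
        }

        String sql() {
            return "SELECT " + String.join(", ", fields) + " FROM " + table;
        }
    }

    // Immutable DSL: every step copies the model and returns a new object,
    // so earlier steps can be reused safely.
    static final class Select {
        private final SelectModel model;

        private Select(SelectModel model) {
            this.model = model;
        }

        static Select select(String... fields) {
            SelectModel m = new SelectModel();
            m.fields.addAll(List.of(fields));
            return new Select(m);
        }

        Select from(String table) {
            SelectModel m = model.copy();
            m.table = table;
            return new Select(m);
        }

        String sql() {
            return model.sql();
        }
    }

    public static void main(String[] args) {
        Select base = Select.select("A", "B");
        System.out.println(base.from("T1").sql()); // SELECT A, B FROM T1
        System.out.println(base.from("T2").sql()); // SELECT A, B FROM T2
    }
}
```

The defensive copy in each DSL method is exactly the tedious part that would be generated rather than written by hand.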

The model API would be the only one manually annotated with:

- @Support: documenting which SQLDialect supports a clause / method in strict / simple / transformative mode
- @State / @Transition: documenting the language aspects (this may change as we gather experience)
 
All of this will certainly have to be prototyped, as it is currently not clear how clean the separation of DSL and model API can be made. For instance, it is quite useful to have these methods on the same type:

Row.IS_NULL() // DSL
Row.fields() // Model

Field.IS_NULL() // DSL
Field.getType() // Model

A clean alternative would of course be:

Row.IS_NULL() // DSL
Row.model().fields() // Model

Field.IS_NULL() // DSL
Field.model().getType() // Model

Or some method other than model() to keep an open door for Oracle's MODEL clause.
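Spelled out as compilable Java, the "clean alternative" above might look roughly like this. The types below are hypothetical stand-ins invented for illustration, not jOOQ's actual Row or Condition interfaces, and the bridge method is called model() only because that is the name used in this post.

```java
import java.util.List;

// Hypothetical illustration of a DSL / model split, not jOOQ code.
final class ModelSeparationSketch {

    interface Condition {
        String sql();
    }

    // Model side: introspection only.
    interface RowModel {
        List<String> fields();
    }

    // DSL side: SQL-building methods plus a single bridge to the model.
    interface Row {
        Condition IS_NULL();
        RowModel model();
    }

    static Row row(String... names) {
        List<String> fields = List.of(names);
        return new Row() {
            @Override
            public Condition IS_NULL() {
                return () -> "(" + String.join(", ", fields) + ") IS NULL";
            }

            @Override
            public RowModel model() {
                return () -> fields;
            }
        };
    }

    public static void main(String[] args) {
        Row r = row("A", "B");
        System.out.println(r.IS_NULL().sql());  // (A, B) IS NULL
        System.out.println(r.model().fields()); // [A, B]
    }
}
```

With this shape, code completion on a Row only offers DSL methods and the one bridge method, which is the point of the separation.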

For example, my very own set of criteria would be:
- all caps.
- union of strict APIs for SQLServer 2008+ and Oracle 10.

That way, the API would not simulate anything (strict SQL) and I would be able to call SQLServer or Oracle specific functions conditionally when I need to. Of course, I would have to be careful not to call Oracle-specific stuff when on SQLServer, but I would likely generate the strict API of each just to keep the corresponding documentation and railroad diagrams at hand.

So, you think that an API union of strict APIs is useful? Or did you mean an API intersection (removing features that aren't available in *both* databases)?

According to your use-case, would API union / intersection include data types? How about stored procedures? There will be caveats, of course :-)

Cheers
Lukas

Christopher Deckers

Jun 25, 2013, 4:18:43 AM
to jooq...@googlegroups.com
Hi Lukas,

 
Nice. So the API generation configuration dimensions are:
- Dialect support
- Strictness
- API naming strategies

In fact, I wonder if there should not be an additional item:
- Target language of choice

If you specify "Java" as your language criterion, keywords and operators may be different than if you say "Scala". This also has something to do with language-specific reserved keywords and syntax sugar.

 
Note, I'm thinking about distinguishing "strictness" between
- No simulation (strict)
- "simple" simulation: syntactic equivalence, such as NVL(a, b) = CASE WHEN a IS NOT NULL THEN a ELSE b END
- "transformative" simulation: transforming SQL to form very different SQL, e.g. LIMIT / OFFSET in Oracle, SQLServer 2008, LPAD() in SQLite (see other thread), or row value expressions and IN predicates:

Yes, this probably makes sense though I fear we developers are going to argue what is considered "simple" :)

 
I'm even dreaming of putting RRDiagram SVGs into the generated Javadoc :-)

Ah, if only our free time were infinite and conversion of DSL to BNF were an easy problem! ;)

 
All of this will certainly have to be prototyped, as it is currently not clear how clean the separation of DSL and model API can be made.

Yes, this might be the only way to find the right solution.

I currently don't know whether the model API should be somewhat exposed or if it can only be an internal thing, so that the exposed facade is the only API. I tend to think that the model should be hidden as an internal API. After all, all jOOQ objects are interfaces (Row, Field, etc.) which could call that internal immutable API in their implementation.

 
So, you think that an API union of strict APIs is useful?

I really meant union :)

jOOQ defaults to a union of all possible dialects, but in my case I just want to restrict to the 2 (or more) database dialects that we use. That way, we use the common subset, except when we have an issue and make a switch/case on the dialect to perform dialect-specific operations. Having this union means I would remove the noise of dialects that we definitely do not care about.


According to your use-case, would API union / intersection include data types?

It should :)
But this multi-dialect handling in jOOQ is not clear to me, as I mentioned in an earlier post.
You could also consider that multi-dialect is a marginal case that you don't want to support, to reduce the issue to one or all dialects.

 
How about stored procedures? There will be caveats, of course :-)

I know, but in case of doubt, well, ignore the criteria.


Speaking about the model, when a user decides to not use the default facade, they would probably have to regenerate their model using their criteria. Generated model and DSL have to be in sync.

Cheers,
-Christopher

Durchholz, Joachim

Jun 25, 2013, 4:45:17 AM
to jooq...@googlegroups.com
> So, you think that an API union of strict APIs
> is useful? Or did you mean an API intersection
> (removing features that aren't available in *both*
> databases)?

My wishlist item for this would be:

The intersection of exactly those dialects I care about, plus anything that can be emulated without resorting to potentially surprising behaviour.

I'm not sure how to organize this, much less implement it; it's definitely an architectural challenge to cleanly cater for all 2^N subsets of N supported dialects, brute-force solutions won't work for that.

Maybe it's enough to offer diagnostics. Say, whenever jOOQ is configured for a dialect, it accepts the dialect to use, and the set of dialects to restrict its API to. That way, jOOQ can emit diagnostics if the application tries things that happen to work with the currently selected dialect but wouldn't for other dialects that the application is supposed to be compatible with.
Just playing with ideas here...
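A rough sketch of this runtime-diagnostics idea, assuming a configuration that knows both the active dialect and the set of dialects the application must stay compatible with. Every name here (DialectDiagnosticsSketch, use(), the Dialect enum, and the per-feature support sets) is hypothetical and illustrative; nothing like this exists in jOOQ as described.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration, not jOOQ code.
final class DialectDiagnosticsSketch {

    enum Dialect { MYSQL, ORACLE, SQLSERVER, H2 }

    private final Dialect active;
    private final Set<Dialect> required = EnumSet.noneOf(Dialect.class);
    private final List<String> warnings = new ArrayList<>();

    DialectDiagnosticsSketch(Dialect active, Set<Dialect> required) {
        this.active = active;
        this.required.addAll(required);
    }

    // Called whenever the application uses a feature; supportedBy is the
    // (assumed) set of dialects that support the feature natively.
    void use(String feature, Set<Dialect> supportedBy) {
        if (!supportedBy.contains(active)) {
            throw new UnsupportedOperationException(
                feature + " is not supported by " + active);
        }
        Set<Dialect> missing = EnumSet.noneOf(Dialect.class);
        missing.addAll(required);
        missing.removeAll(supportedBy);
        if (!missing.isEmpty()) {
            warnings.add(feature + " works on " + active + " but not on " + missing);
        }
    }

    List<String> warnings() {
        return warnings;
    }

    public static void main(String[] args) {
        DialectDiagnosticsSketch d = new DialectDiagnosticsSketch(
            Dialect.MYSQL, EnumSet.of(Dialect.MYSQL, Dialect.ORACLE));
        // The support sets below are made up for the example.
        d.use("LIMIT", EnumSet.of(Dialect.MYSQL, Dialect.H2));
        d.use("COUNT(*)", EnumSet.allOf(Dialect.class));
        d.warnings().forEach(System.out::println);
        // prints: LIMIT works on MYSQL but not on [ORACLE]
    }
}
```

The trade-off compared with generating a restricted API is that problems surface at runtime on the development database rather than at compile time.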

Lukas Eder

Jun 25, 2013, 5:02:28 AM
to jooq...@googlegroups.com



2013/6/25 Durchholz, Joachim <Joachim....@hennig-fahrzeugteile.de>

> So, you think that an API union of strict APIs
> is useful? Or did you mean an API intersection
> (removing features that aren't available in *both*
> databases)?

My wishlist item for this would be:

The intersection of exactly those dialects I care about, plus anything that can be emulated without resorting to potentially surprising behaviour.

Yes. Pick MySQL+Oracle dialects (for instance), choose "transformative" strictness, choose API intersection (not union, in this case), generate.
 
I'm not sure how to organize this, much less implement it; it's definitely an architectural challenge to cleanly cater for all 2^N subsets of N supported dialects, brute-force solutions won't work for that.

Should be easy. All relevant methods are already annotated with @Support({ dialect1, dialect2 }). With API intersection, both MySQL+Oracle must be contained in a method's @Support for the method to make it into the API. With API union, one of MySQL, Oracle must be contained.
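The selection rule can be sketched with plain reflection. The @Support annotation and Dialect enum below are simplified stand-ins defined locally for the example (jOOQ's real annotation takes SQLDialect values), and the toy Api interface and its per-method dialect sets are invented for illustration only.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of API union / intersection by @Support.
final class SupportFilterSketch {

    enum Dialect { MYSQL, ORACLE, POSTGRES }

    // Local stand-in for jOOQ's @Support annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Support {
        Dialect[] value();
    }

    // Toy API; the dialect sets are made up for the example.
    interface Api {
        @Support({Dialect.MYSQL, Dialect.ORACLE, Dialect.POSTGRES}) void select();
        @Support({Dialect.ORACLE}) void connectBy();
        @Support({Dialect.MYSQL, Dialect.POSTGRES}) void limit();
        @Support({Dialect.POSTGRES}) void distinctOn();
    }

    // Returns the names of the methods that make it into the generated API.
    static Set<String> select(Class<?> api, Set<Dialect> chosen, boolean intersection) {
        Set<String> names = new TreeSet<>();
        for (Method m : api.getDeclaredMethods()) {
            Support s = m.getAnnotation(Support.class);
            if (s == null) {
                continue; // unannotated methods are ignored in this sketch
            }
            Set<Dialect> supported = EnumSet.noneOf(Dialect.class);
            supported.addAll(Arrays.asList(s.value()));
            boolean keep = intersection
                ? supported.containsAll(chosen)                   // all chosen dialects
                : chosen.stream().anyMatch(supported::contains);  // at least one
            if (keep) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Set<Dialect> chosen = EnumSet.of(Dialect.MYSQL, Dialect.ORACLE);
        System.out.println(select(Api.class, chosen, true));  // [select]
        System.out.println(select(Api.class, chosen, false)); // [connectBy, limit, select]
    }
}
```

Because each method is checked independently against the chosen set, the cost is linear in the number of API methods and does not depend on enumerating the 2^N dialect subsets.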

@Support currently only documents "transformative" strictness. It will be enhanced to distinguish between 3 strictness levels:
 
Maybe it's enough to offer diagnostics. Say, whenever jOOQ is configured for a dialect, it accepts the dialect to use, and the set of dialects to restrict its API to. That way, jOOQ can emit diagnostics if the application tries things that happen to work with the currently selected dialect but wouldn't for other dialects that the application is supposed to be compatible with.

The plans being discussed here will allow for very precise compile-time checking of whether a clause / expression is supported by a set of dialects (except when choosing API union). There is no way to "try things" that will not work, unless I'm misunderstanding something?

Durchholz, Joachim

Jun 25, 2013, 5:12:03 AM
to jooq...@googlegroups.com
> The plans being discussed here will allow for very
> precise compile-time checking of whether a clause
> / expression is supported by a set of dialects
> (except when choosing API union). There is no way
> to "try things" that will not work, unless I'm
> misunderstanding something?

Ad-hoc generation of the relevant subset of the API?
Sure sounds cool, but also sounds scary.
Also sounds like it's going to complicate the build process. Building is already a complicated thing in Java land; whatever the build activity, it needs to be easy to integrate into Ant, Maven, Gradle, SBT, and (hopefully) all the other cool build tools that will come along in the future. Oh, and it needs to integrate seamlessly with IDEs, and in my experience, that's the hardest subtask for getting the build under control - configuring Eclipse for an externally-run build step is clunky and not the kind of expertise that the average Eclipse user has.

(I might be missing the point since I obviously missed that part of the discussion.)

Lukas Eder

Jun 25, 2013, 1:02:20 PM
to jooq...@googlegroups.com
Hi Christopher,
2013/6/25 Christopher Deckers <chr...@gmail.com>

Hi Lukas,
Nice. So the API generation configuration dimensions are:
- Dialect support
- Strictness
- API naming strategies

In fact, I wonder if there should not be an additional item:
- Target language of choice
If you specify "Java" as your language criterion, keywords and operators may be different than if you say "Scala". This also has something to do with language-specific reserved keywords and syntax sugar.
 
That would be nice. In fact, it would be very useful, if jOOQ Scala APIs could explicitly overload operators, instead of relying on implicit defs in a separate Java to Scala conversion object, as it is implemented now.
 
Of course, it would be good to think about other Scala language elements, which could be leveraged that way, e.g. the use of Field[Option[Int]] for what is a nullable Field[Integer] in Java, e.g. the use of Scala collections instead of Java ones, etc.
 
But I think that will be food for thought for jOOQ 5.0. :-)
 
Note, I'm thinking about distinguishing "strictness" between
- No simulation (strict)
- "simple" simulation: syntactic equivalence, such as NVL(a, b) = CASE WHEN a IS NOT NULL THEN a ELSE b END
- "transformative" simulation: transforming SQL to form very different SQL, e.g. LIMIT / OFFSET in Oracle, SQLServer 2008, LPAD() in SQLite (see other thread), or row value expressions and IN predicates:
Yes, this probably makes sense though I fear we developers are going to argue what is considered "simple" :)
 
You're a wise man :-)
How about
 
In other words, "simple" simulation must not repeat any expressions from the original clause / function. This would make my previous "simple" example a "transformative" one. It would be a useful distinction, as users will have to be more careful when using "transformative" simulations, rather than "simple" ones.
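To illustrate why repetition matters, here is a small string-based sketch. The class and method names are hypothetical and this is not how jOOQ renders SQL. The COALESCE rewrite is an example chosen here (it is not mentioned in the thread) of a simulation that uses each argument exactly once.

```java
// Hypothetical illustration, not jOOQ code.
final class SimulationSketch {

    // NVL(a, b) rewritten as CASE: the expression a is rendered twice,
    // so by the proposed rule this is a "transformative" simulation.
    static String nvlAsCase(String a, String b) {
        return "CASE WHEN " + a + " IS NOT NULL THEN " + a + " ELSE " + b + " END";
    }

    // NVL(a, b) rewritten as COALESCE: each argument is rendered once,
    // so this would still count as "simple".
    static String nvlAsCoalesce(String a, String b) {
        return "COALESCE(" + a + ", " + b + ")";
    }

    // Counts non-overlapping occurrences of expr in sql.
    static int occurrences(String sql, String expr) {
        int count = 0;
        int from = 0;
        while ((from = sql.indexOf(expr, from)) >= 0) {
            count++;
            from += expr.length();
        }
        return count;
    }

    public static void main(String[] args) {
        String viaCase = nvlAsCase("RANDOM()", "0");
        String viaCoalesce = nvlAsCoalesce("RANDOM()", "0");
        System.out.println(viaCase);
        // CASE WHEN RANDOM() IS NOT NULL THEN RANDOM() ELSE 0 END
        System.out.println(occurrences(viaCase, "RANDOM()"));     // 2
        System.out.println(occurrences(viaCoalesce, "RANDOM()")); // 1
    }
}
```

With a non-deterministic argument such as RANDOM(), the CASE form may test one value and return another, which is the kind of surprise the distinction is meant to flag.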
 
Or, as Roger Thomas put it, if RANDOM() cannot be passed to a function simulation, it is no longer "simple". See this other discussion about SQL function simulation for details:
 
I'm even dreaming of putting RRDiagram SVGs into the generated Javadoc :-)
Ah, if only our free time were infinite and conversion of DSL to BNF were an easy problem! ;)
 
Yeah. In that order! :-)
All of this will certainly have to be prototyped, as it is currently not clear how clean the separation of DSL and model API can be made.

Yes, this might be the only way to find the right solution.

I currently don't know whether the model API should be somewhat exposed or if it can only be an internal thing, so that the exposed facade is the only API. I tend to think that the model should be hidden as an internal API. After all, all jOOQ objects are interfaces (Row, Field, etc.) which could call that internal immutable API in their implementation.
 
That's the question. Are Row, Field DSL or model API? Or both, as they are now? This has to be explored. I'm not 100% sure if a clean separation is:
 
- possible
- desirable
 
We'll see...
 
So, you think that an API union of strict APIs is useful?
 
I really meant union :)
jOOQ defaults to a union of all possible dialects, but in my case I just want to restrict to the 2 (or more) database dialects that we use. That way, we use the common subset, except when we have an issue and make a switch/case on the dialect to perform dialect-specific operations. Having this union means I would remove the noise of dialects that we definitely do not care about.
 
I see, yes of course, this is a use-case. So when selecting dialects, one can specify whether union or intersection should be applied. Maybe, the strictness level can even be specified on a dialect-basis, e.g.
 
Oracle: Strict INTERSECT H2: Transformative
 
... if that makes sense?
 
According to your use-case, would API union / intersection include data types?

It should :)
But this multi-dialect handling in jOOQ is not clear to me, as I mentioned in an earlier post.
You could also consider that multi-dialect is a marginal case that you don't want to support, to reduce the issue to one or all dialects.
 
It's not so marginal. Many users will use H2 or HSQLDB as test databases, even if I wouldn't recommend it, personally. 
  
How about stored procedures? There will be caveats, of course :-)
I know, but in case of doubt, well, ignore the criteria.
Speaking about the model, when a user decides to not use the default facade, they would probably have to regenerate their model using their criteria. Generated model and DSL have to be in sync.
 
True. This is easy to check with the compiler.
 
Cheers
Lukas 

Lukas Eder

Jun 26, 2013, 2:19:54 AM
to jooq...@googlegroups.com


2013/6/25 Durchholz, Joachim <Joachim....@hennig-fahrzeugteile.de>
I don't think we need to be scared, here.
 
1. The above wouldn't be necessary if adding jOOQ as a regular dependency. jOOQ will continue to ship with the DSL API already available, which is what most users will choose.
2. Those users who want to tweak the API / DSL will be ready for a one-shot effort, integrating jOOQ as a source dependency into their build. I suspect there will be Maven plugins to achieve building the required API parts, or APT. In any case, they will probably do this once, and put the generated sources and binary into a Maven repository, or something else.
 
jOOQ already has a code-generator for database objects. I don't think this one will be any harder to integrate.

Lukas Eder

Jun 26, 2013, 3:33:26 AM
to jooq...@googlegroups.com, Christopher Deckers
Hi Christopher,

2013/6/25 Lukas Eder <lukas...@gmail.com>

Hi Christopher,
2013/6/25 Christopher Deckers <chr...@gmail.com>
Hi Lukas,
Nice. So the API generation configuration dimensions are:
- Dialect support
- Strictness
- API naming strategies

In fact, I wonder if there should not be an additional item:
- Target language of choice
If you specify "Java" as your language criterion, keywords and operators may be different than if you say "Scala". This also has something to do with language-specific reserved keywords and syntax sugar.
 
That would be nice. In fact, it would be very useful, if jOOQ Scala APIs could explicitly overload operators, instead of relying on implicit defs in a separate Java to Scala conversion object, as it is implemented now.
 
Of course, it would be good to think about other Scala language elements, which could be leveraged that way, e.g. the use of Field[Option[Int]] for what is a nullable Field[Integer] in Java, e.g. the use of Scala collections instead of Java ones, etc.
 
But I think that will be food for thought for jOOQ 5.0. :-)

Just a short note on why Scala would need to be looked at in a much broader context. Check out how sqltyped leverages Scala 2.10 Macros to produce type safety from SQL strings (strings!)


This is quite impressive, although not very usable in its current form. E.g. SQL strings need to be constants (I think) for the Macro processor to be able to construct a compile-time AST. This makes dynamic SQL impossible. But for inline SQL, it could be quite powerful.

What I'm trying to say is, it may be worth evaluating to write scooq, instead of adding 1-2 utilities to jOOQ. Or since Scala is mainly functional, not OO, a "better" name might be scfq ;-)

Christopher Deckers

Jul 14, 2013, 12:05:23 PM
to Lukas Eder, jooq...@googlegroups.com
Hi Lukas,

What I'm trying to say is, it may be worth evaluating to write scooq, instead of adding 1-2 utilities to jOOQ. Or since Scala is mainly functional, not OO, a "better" name might be scfq ;-)

Remember that Java 8 is coming and you may have to think of "jFQ" too :)


Cheers,
-Christopher

Lukas Eder

Jul 16, 2013, 3:11:43 PM
to Christopher Deckers, jooq...@googlegroups.com



2013/7/14 Christopher Deckers <chr...@gmail.com>

Hi Lukas,

What I'm trying to say is, it may be worth evaluating to write scooq, instead of adding 1-2 utilities to jOOQ. Or since Scala is mainly functional, not OO, a "better" name might be scfq ;-)

Remember that Java 8 is coming and you may have to think of "jFQ" too :)

I've had a couple of chats about this topic with Thomas Müller, the maintainer of the H2 database and of JaQu [1]. He's really interested in that topic, thinking about adding functional-style predicates to his fluent SQL API. JaQu is already "more embedded" in Java than jOOQ, as it plays a lot with constructors, byte-code instrumentation (i.e. decompiling byte code to generate SQL) and the "double-curly-braces" technique.

I'm personally not a fan of these features at all. While they're fun to play around with, and fun to talk about with Thomas Müller, I really think that LINQ-style APIs will bring the next big impedance mismatch to us developers: The idea of being able to merge programming paradigms (OO, functional, declarative). In my opinion, specifically, SQL's declarative style should not be mixed with anything else, if you want predictable SQL.

It's perfectly OK, though, to apply functional mappers onto jOOQ Result objects, once they have been obtained from the database. Something like this is already possible today:

DSL.using(configuration)
    .select(...)
    .from(...)
    .fetch((Record r) -> transform(r));

Replace "(Record r) -> transform(r)" with your own lambda.

Cheers
Lukas

Nicholas Ustinov

Aug 7, 2013, 4:53:17 PM
to jooq...@googlegroups.com, Lukas Eder
Hi!

What I'm trying to say is, it may be worth evaluating to write scooq, instead of adding 1-2 utilities to jOOQ. Or since Scala is mainly functional, not OO, a "better" name might be scfq ;-)
 
Well, any concrete plans?
We are continuing to evaluate jOOQ in our Scala development. For now we only need to get it working correctly and usably enough, but in the foreseeable future we are interested in broader improvements to jOOQ's Scala integration, and we may be able to help implement something.

Nikolay

Lukas Eder

Aug 8, 2013, 5:04:11 AM
to Nicholas Ustinov, jooq...@googlegroups.com
Hi Nicholas,

2013/8/7 Nicholas Ustinov <nust...@gmail.com>
I personally think that jOOQ is the best way to write SQL in Java. And it will eventually also be the best way to write SQL in Scala and other JVM languages. But right now, I will not be able to spend much time on a professional long-term jOOQ / Scala integration as I have other priorities with jOOQ. This includes not being able to accept / maintain larger contributions in that area, right now.

It is certainly something to think about for 2014, though!