Missing double quotes around table and column names

1,001 views
Skip to first unread message

Gavriel Fleischer

unread,
Dec 1, 2014, 6:06:18 AM12/1/14
to jooq...@googlegroups.com
Hi,

I'm using jooq  3.4.2 to generate queries for PostgreSQL DB. I set the dsl context to Dialect.POSTGRES. When I use the generated constants, everything is fine, jooq adds the necessary double quotes, but when I start to write "custom" sql, then there are no quotes.

For example:

dslContext.select(TABLE.COL1, TABLE.COL2).from(TABLE); // everything is with the double quotes

or:

SelectHavingStep<Record> t = dslContext.select(TABLE.DATE.substring(1, 4).concat(TABLE.DATE.substring(6, 2)).cast(Integer.class).as("yearMonth")).from(TABLE.as("t"));

generates:

cast((substring(cast("Table"."Date" as varchar), 1, 4) || substring(cast("Table"."Date" as varchar), 6, 2)) as int) as "yearMonth"

But when I try to use it, as a sub-select and I need to relate to the "yearMonth" column alias:

dslContext.select(field("yearMonth")).from(t.asTable("t"))

In the generated sql yearMonth is not in double quotes.

Is there a way to force the quotes also around the "custom" table and column aliases?

Lukas Eder

unread,
Dec 1, 2014, 6:17:08 AM12/1/14
to jooq...@googlegroups.com
Hello,

There's a subtle difference between

- DSL.field()
- DSL.fieldByName()

The first method is really useful for "plain SQL", i.e. you could as well just write arbitrary SQL in there:

DSL.field("(select 1 as x)");

What you're really looking for is the second method:

DSL.fieldByName("yearMonth");

Hope this helps
Lukas

--
You received this message because you are subscribed to the Google Groups "jOOQ User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jooq-user+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Alok Menghrajani

unread,
Dec 1, 2014, 1:54:55 PM12/1/14
to jooq...@googlegroups.com
On Mon, Dec 1, 2014 at 3:17 AM, Lukas Eder <lukas...@gmail.com> wrote:
> Hello,
>
> There's a subtle difference between
>
> - DSL.field()
> - DSL.fieldByName()

Can we make this a little less subtle by renaming DSL.field() to
something more explicit? Here are a few suggestions:
DSL.fieldRaw(), DSL.fieldUnescaped(), DSL.fieldRawSql() or
DSL.fieldUnescapedSql().

It will also solve the problem of people picking the first thing that
shows up & seems to work in their type ahead; forcing them to stop and
think about what they are doing.

Alok

Lukas Eder

unread,
Dec 2, 2014, 2:23:02 AM12/2/14
to jooq...@googlegroups.com
There is certainly room for improvement in that area. The two modes of operation are very similar in the way they're used throughout the API as both operations simply require String arguments, and String doesn't have have a semantics per se.

An example of where this is also subtle:

DSL.select()
   .from("plain SQL table") -- DSL.table("...") semantics

vs.

DSL.dropTable("tableByName") -- DSL.tableByName("...") semantics

One way to disambiguate this would be to deprecate the tableByName and fieldByName methods (and all methods that accept strings with a "Name" semantics), and offer alternative API methods accepting the org.jooq.Name type:

DSL.table(DSL.name("tableByName"));
DSL.field(DSL.name("fieldByName"), Integer.class);

This would make using the jOOQ API a bit more verbose, but it might still be acceptable.

Another option would be to change the default for RenderNameStyle from QUOTED to AS_IS. This is the setting that produces case-sensitive names in the first place. Changing that default would be very delicate and I'm not so sure if it's a good idea.

2014-12-01 19:54 GMT+01:00 Alok Menghrajani <al...@squareup.com>:
Can we make this a little less subtle by renaming DSL.field() to
something more explicit? Here are a few suggestions:
DSL.fieldRaw(), DSL.fieldUnescaped(), DSL.fieldRawSql() or
DSL.fieldUnescapedSql().

I think that the plain SQL method names should be as simple as possible. They are much more generally useful than the fieldByName / tableByName methods, which is why they deserve the short name. Anything that adds words to them will make them very hard to read.

It will also solve the problem of people picking the first thing that
shows up & seems to work in their type ahead; forcing them to stop and
think about what they are doing

I agree, but I'd like to take this opportunity to help people understand that most databases are case-insensitive by default, and that jOOQ is helping them getting case sensitivity right - because ultimately, that is what they want when columns are deserialised to Java again: Case sensitive column names.

Is there any way this can be improved via the jOOQ API.

Are there any other ideas in the group?

Cheers,
Lukas 

Gavriel Fleischer

unread,
Dec 2, 2014, 8:31:17 PM12/2/14
to jooq...@googlegroups.com
About the case sensitivity:

I would rather see the constants that are generated by jooq to have the same case as the table, column names are in the DB. Currently when I have a table called StatisticsByWebsite and a column WebsiteShortUrl, then the generated classname would be Statisticsbywebsite, that is hard to read, and not according to most people (at least that I know) Java coding standard. StatisticsByWensite would be a better class name.

As for the constants... well there it's not that clear, because on one hand constants usually have full capital name, on the other hand where they have words, then they are separated by underline, so the constants could be STATISTICS_BY_WEBSITE.WEBSITE_SHORT_URL.

Regards,
Gavriel

--
You received this message because you are subscribed to a topic in the Google Groups "jOOQ User Group" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/jooq-user/469obN2diSM/unsubscribe.
To unsubscribe from this group and all its topics, send an email to jooq-user+...@googlegroups.com.

Lukas Eder

unread,
Dec 3, 2014, 1:47:38 AM12/3/14
to jooq...@googlegroups.com
There are essentially two things:

1. What we discussed before are the actual table / column *names* as used in SQL strings
2. What you call "constants" now is what we call "identifiers" in the code generator.

The current default for the code generator's naming strategy grew historically:

- Classes are PascalCased
- Identifiers are UPPER_CASED_WITH_UNDERSCORE

When we designed this default, we didn't think of the possibility of databases like SQL Server, where people tend to PascalCase also their table / column names. It was designed for databases with UPPER_CASED_WITH_UNDERSCORE table / column names.

We're thinking about changing that default such that the identifiers will look exactly as in the database:

You can already achieve this today by writing your own naming strategy:

--
You received this message because you are subscribed to the Google Groups "jOOQ User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jooq-user+...@googlegroups.com.

Alok Menghrajani

unread,
Dec 8, 2014, 6:37:37 PM12/8/14
to jooq...@googlegroups.com
On Mon, Dec 1, 2014 at 11:23 PM, Lukas Eder <lukas...@gmail.com> wrote:
> There is certainly room for improvement in that area. The two modes of
> operation are very similar in the way they're used throughout the API as
> both operations simply require String arguments, and String doesn't have
> have a semantics per se.

Correct. You could require the raw SQL statements to be wrapped in DSL.raw():

DSL.select().from(DSL.raw("plain SQL table"))

vs

DSL.drop("tableByName")

The advantage of doing this is that:
* Forgetting to do something (i.e. forgetting to call DSL.raw) does
not lead to potentially unsafe code.
* The code is more explicit. When you are reading the code, you know
what is going on.
* If DSL.raw is used consistently, it makes it easy to audit / setup a
commit hook / highlight things at code review time / etc.

> One way to disambiguate this would be to deprecate the tableByName and
> fieldByName methods (and all methods that accept strings with a "Name"
> semantics), and offer alternative API methods accepting the org.jooq.Name
> type:
>
> DSL.table(DSL.name("tableByName"));
> DSL.field(DSL.name("fieldByName"), Integer.class);

Would you keep the DSL.table(String), DSL.field(String) methods? If
yes, I think it's safer to do it the other way around for the reasons
I mentioned above.

> This would make using the jOOQ API a bit more verbose, but it might still be
> acceptable.

The API becomes a bit more verbose (an extra method call around a
bunch of (hopefully) static strings). However don't forget the API
becomes simpler (since you can deprecate/remove tableByName,
fieldByName, etc.). Overall it's a reduction in the cognitive work for
the library users.

Alok

Lukas Eder

unread,
Dec 9, 2014, 3:45:22 AM12/9/14
to jooq...@googlegroups.com
2014-12-09 0:37 GMT+01:00 Alok Menghrajani <al...@squareup.com>:
On Mon, Dec 1, 2014 at 11:23 PM, Lukas Eder <lukas...@gmail.com> wrote:
> There is certainly room for improvement in that area. The two modes of
> operation are very similar in the way they're used throughout the API as
> both operations simply require String arguments, and String doesn't have
> have a semantics per se.

Correct. You could require the raw SQL statements to be wrapped in DSL.raw():

DSL.select().from(DSL.raw("plain SQL table"))

"raw()" by itself would be insufficient, because there are currently four different types of objects that can be constructed this way: Schema, Table, Field, and Sequence. It would be inevitable to call those methods rawSchema(), rawTable(), rawField(), rawSequence() ... or just ... schema(), table(), field(), sequence() :-)

but see more on that below...

vs

DSL.drop("tableByName")

The advantage of doing this is that:
* Forgetting to do something (i.e. forgetting to call DSL.raw) does
not lead to potentially unsafe code.

That's a good point, although, you're inserting strings into the API. This has an implicit feel of "ooh, am I doing this right?" I agree, though that terms like "raw" would make users even more alert.
 
* The code is more explicit. When you are reading the code, you know
what is going on.

I personally think that explicitness is most useful with the "*ByName" semantics where Strings must only contain schema/table/field/sequence identifiers, which is a big feature reduction compared to raw/plain SQL from a cognitive perspective. Many people currently get this wrong.

With plain SQL, you will always know what's going on when reading the code, though, I suspect? E.g. these two:

DSL.using(configuration)
   .select(count(VBL_DEFAULT.ID))
   .from(table("vbl_default partition(p_04-Dec-14)"))
   .where(...);
and

DSL.using(configuration)
   .select(count(VBL_DEFAULT.ID))
   .from("vbl_default partition(p_04-Dec-14)")
   .where(...);
Specifically, the version where a String is passed to "from" seems particularly readable to me.

* If DSL.raw is used consistently, it makes it easy to audit / setup a
commit hook / highlight things at code review time / etc.

This could also be achieved with a new annotation, which could be checked via JSR-308. I've had a very interesting discussion with spec lead Mike Ernst recently, about the potential of JSR-308 validation of the current @Support annotations to check that only API supported by the underlying database is used. He was absolutely thrilled by the idea of doing this in jOOQ, he almost forgot his dinner ;-)

In any case, I've registered an issue for adding @PlainSQL annotations:

No matter how we evolve this, it would be useful to annotated potentially dangerous methods this way. We have already added the same SQL injection warning in all Javadocs, but formal client-side validation might be even better. We'll be doing some research and blog about JSR-308 pretty soon...

> One way to disambiguate this would be to deprecate the tableByName and
> fieldByName methods (and all methods that accept strings with a "Name"
> semantics), and offer alternative API methods accepting the org.jooq.Name
> type:
>
> DSL.table(DSL.name("tableByName"));
> DSL.field(DSL.name("fieldByName"), Integer.class);

Would you keep the DSL.table(String), DSL.field(String) methods?

Yes, probably.
 
If yes, I think it's safer to do it the other way around for the reasons
I mentioned above.

Maybe, but it would be a heavy change and chances are that once you do need some String version of an object (raw or just the name), that the name will not be sufficient for very long and you will have to switch again to raw SQL.

If we fusion your ideas with mine, the safest and most thorough thing would be to create AST elements for everything and to gently remove the current "dangerous" convenience methods (although I will think about this thoroughly, as it is really a big change).

Pseudo-API:

class DSL {

    // create names for use with any other AST element
    static Name name(String... qualifiedName) { ... }

    // create plain SQL for use with any other AST element
    static SQL sql(String sql) { ... }
    static SQL sql(String sql, Object... bindings) { ... }
    static SQL sql(String sql, QueryPart... parts) { ... }

    // Use Name or SQL in typed AST elements
    static Field<Object> field(Name name) { ... }
    static Field<Object> field(SQL sql) { ... }
    static Table<?> table(Name name) { ... }
    static Table<?> table(SQL sql) { ... }

    static DropTableStep dropTable(Table<?> table) { ... }
    static DropTableStep dropTable(Name name) { ... }
}

interface SelectWhereStep {
    SelectConditionStep where(SQL sql) { ... }
}

interface Name extends QueryPart { ... }
interface SQL extends QueryPart { ... }

Marked in yellow: The only API elements that really take plain SQL arguments - just like your "DSL.raw()".

The advantage of this new SQL type would be the fact that it can immediately be embedded in other plain SQL templates, as in

QueryPart anyExpression = DSL.sql("abc + xyz");
DSL.field(DSL.sql("some_function({0})", anyExpression));

The same way that Name can already be embedded, or even keywords via DSL.keyword()

> This would make using the jOOQ API a bit more verbose, but it might still be
> acceptable.

The API becomes a bit more verbose (an extra method call around a
bunch of (hopefully) static strings). However don't forget the API
becomes simpler (since you can deprecate/remove tableByName,
fieldByName, etc.). Overall it's a reduction in the cognitive work for
the library users.

Exactly. This confusion popped up too many times, which means that the current status quo is more than insufficient from a "cognitive work" perspective.

Even if maybe plain SQL should be restricted to only very few methods, all Fields should be obtained via overloaded DSL.field(...) methods, Tables should be obtained via overloaded DSL.table(...) methods.

Lukas Eder

unread,
Dec 9, 2014, 3:45:54 AM12/9/14
to jooq...@googlegroups.com
... btw, thanks for keeping with us in that discussion. This has been very fruitful

Lukas Eder

unread,
Dec 9, 2014, 4:24:35 AM12/9/14
to jooq...@googlegroups.com
In fact... another possibility to get confused between String (as in *ByName), String (as in Plain SQL) is the String (as in bind value) type. With all the convenience methods that allow for String arguments, the ordinary String binding to a Field's <T> type can be yet again, confusing.

I guess that in the long run, the ideal solution is really to accept the fact that the existing convenience is misleading and inconsistent, and that AST elements simply always have to be wrapped in AST construction methods, while the String type is reserved to bind values for Field<String>.

2014-12-09 9:45 GMT+01:00 Lukas Eder <lukas...@gmail.com>:

Alok Menghrajani

unread,
Dec 9, 2014, 1:06:35 PM12/9/14
to jooq...@googlegroups.com
On Tue, Dec 9, 2014 at 12:45 AM, Lukas Eder <lukas...@gmail.com> wrote:
>
>
> 2014-12-09 0:37 GMT+01:00 Alok Menghrajani <al...@squareup.com>:
>>
>> On Mon, Dec 1, 2014 at 11:23 PM, Lukas Eder <lukas...@gmail.com> wrote:
>> > There is certainly room for improvement in that area. The two modes of
>> > operation are very similar in the way they're used throughout the API as
>> > both operations simply require String arguments, and String doesn't have
>> > have a semantics per se.
>>
>> Correct. You could require the raw SQL statements to be wrapped in
>> DSL.raw():
>>
>> DSL.select().from(DSL.raw("plain SQL table"))
>
>
> "raw()" by itself would be insufficient, because there are currently four
> different types of objects that can be constructed this way: Schema, Table,
> Field, and Sequence. It would be inevitable to call those methods
> rawSchema(), rawTable(), rawField(), rawSequence() ... or just ... schema(),
> table(), field(), sequence() :-)

DSL.raw() does not need to actually return an escaped string. It could
return an object, something like:

class RawString {
private String raw;

public RawString(String raw) {
this.raw = raw;
}

public String escapeForSchema() {
...
}
public String escapeForTable() {
...
}
etc.
}

RawString's escape methods would then get called by the API as needed.
DSL.raw() could be seen as a promise to escape your string, once the
context is known.

> but see more on that below...
>
>> vs
>>
>> DSL.drop("tableByName")
>>
>> The advantage of doing this is that:
>> * Forgetting to do something (i.e. forgetting to call DSL.raw) does
>> not lead to potentially unsafe code.
>
>
> That's a good point, although, you're inserting strings into the API. This
> has an implicit feel of "ooh, am I doing this right?" I agree, though that
> terms like "raw" would make users even more alert.

I'm glad we agree on the same things :)

> I personally think that explicitness is most useful with the "*ByName"
> semantics where Strings must only contain schema/table/field/sequence
> identifiers, which is a big feature reduction compared to raw/plain SQL from
> a cognitive perspective. Many people currently get this wrong.
>
> With plain SQL, you will always know what's going on when reading the code,
> though, I suspect? E.g. these two:
>
> DSL.using(configuration)
> .select(count(VBL_DEFAULT.ID))
> .from(table("vbl_default partition(p_04-Dec-14)"))
> .where(...);
>
> and
>
> DSL.using(configuration)
> .select(count(VBL_DEFAULT.ID))
> .from("vbl_default partition(p_04-Dec-14)")
> .where(...);
>
> Specifically, the version where a String is passed to "from" seems
> particularly readable to me.

To me, it's unclear in which case it's ok to use externally controlled
strings and in which case it's not ok. I would love to get the opinion
of all the other people on this list.

>> * If DSL.raw is used consistently, it makes it easy to audit / setup a
>> commit hook / highlight things at code review time / etc.
>
>
> This could also be achieved with a new annotation, which could be checked
> via JSR-308. I've had a very interesting discussion with spec lead Mike
> Ernst recently, about the potential of JSR-308 validation of the current
> @Support annotations to check that only API supported by the underlying
> database is used. He was absolutely thrilled by the idea of doing this in
> jOOQ, he almost forgot his dinner ;-)
>
> In any case, I've registered an issue for adding @PlainSQL annotations:
> https://github.com/jOOQ/jOOQ/issues/3851
>
> No matter how we evolve this, it would be useful to annotated potentially
> dangerous methods this way. We have already added the same SQL injection
> warning in all Javadocs, but formal client-side validation might be even
> better. We'll be doing some research and blog about JSR-308 pretty soon...

http://bunkstrutts.files.wordpress.com/2011/11/corgi-bounci.gif

>> > One way to disambiguate this would be to deprecate the tableByName and
>> > fieldByName methods (and all methods that accept strings with a "Name"
>> > semantics), and offer alternative API methods accepting the
>> > org.jooq.Name
>> > type:
>> >
>> > DSL.table(DSL.name("tableByName"));
>> > DSL.field(DSL.name("fieldByName"), Integer.class);
>>
>> Would you keep the DSL.table(String), DSL.field(String) methods?
>
>
> Yes, probably.
>
>>
>> If yes, I think it's safer to do it the other way around for the reasons
>> I mentioned above.
>
>
> Maybe, but it would be a heavy change and chances are that once you do need
> some String version of an object (raw or just the name), that the name will
> not be sufficient for very long and you will have to switch again to raw
> SQL.

Sorry, I lost you. I think it's fine to have raw SQL in the API, but
its use should be the exception rather than the rule.

> If we fusion your ideas with mine, the safest and most thorough thing would
> be to create AST elements for everything and to gently remove the current
> "dangerous" convenience methods (although I will think about this
> thoroughly, as it is really a big change).

You know jooq better than anyone else to know if this is the right
approach. It feels right to me, but you'll need to find a way to do
this gradually / without breaking too much existing code.
Alok

Lukas Eder

unread,
Dec 10, 2014, 5:21:08 AM12/10/14
to jooq...@googlegroups.com
2014-12-09 19:06 GMT+01:00 Alok Menghrajani <al...@squareup.com>:

DSL.raw() does not need to actually return an escaped string. It could
return an object, something like:

class RawString {
  private String raw;

  public RawString(String raw) {
    this.raw = raw;
  }

  public String escapeForSchema() {
    ...
  }
  public String escapeForTable() {
    ...
  }
  etc.
}

RawString's escape methods would then get called by the API as needed.
DSL.raw() could be seen as a promise to escape your string, once the
context is known.

While there can certainly be an object called something like RawString with no specific AST semantics, users will inevitably want to combine the type of object they had in mind with other operations. For example, they will want to compare plain SQL fields with other values, such as today:

DSL.field("some-complex-expression").eq(42)
 
From a usability perspective, it is crucial that the actual AST type be strongly typed towards Schema, Table, Field, Sequence semantics.

The semantics of escaping is already properly handled in jOOQ internally and shouldn't be exposed publicly. It won't add much value to API consumers, in my opinion.

> That's a good point, although, you're inserting strings into the API. This
> has an implicit feel of "ooh, am I doing this right?" I agree, though that
> terms like "raw" would make users even more alert.

I'm glad we agree on the same things :)

Yep :-)
 
Albeit, we won't introduce "raw()" as a method name as all of the manual and Javadoc refers to "raw" as "plain SQL". So the method will be called "DSL.sql()". There are also plans of supporting a "sql()" method in all of the DSL to inject local SQL strings for vendor-specific features or comments. So the two method names will be aligned.
 
To me, it's unclear in which case it's ok to use externally controlled
strings and in which case it's not ok. I would love to get the opinion
of all the other people on this list.

I agree, any opinion is welcome.
 
> No matter how we evolve this, it would be useful to annotated potentially
> dangerous methods this way. We have already added the same SQL injection
> warning in all Javadocs, but formal client-side validation might be even
> better. We'll be doing some research and blog about JSR-308 pretty soon...

http://bunkstrutts.files.wordpress.com/2011/11/corgi-bounci.gif

Well, that just about made my day ;-)

Cheers,
Lukas

Lukas Eder

unread,
May 9, 2016, 3:33:56 PM5/9/16
to jOOQ User Group
For the record, checker framework support is now available in jOOQ 3.9, although you can also manually backport the module to work with earlier releases. More details here:
Reply all
Reply to author
Forward
0 new messages