Record.getValue( ) can return wrong value when mixing table alias with table

239 views
Skip to first unread message

david....@gmail.com

unread,
Jul 17, 2015, 1:48:35 PM7/17/15
to jooq...@googlegroups.com
Hi,

Using Jooq 3.6.1, In a situation similar to the following:

<type> a = Tables.A.as("a");
<type> b = Tables.B.as("b");

Record record = ... .select(a.name, b.name) ...

record.getValue(Tables.B.name) was returning the value fetched for a.name.

I know it doesn't make sense that the code was using aliases during the select and not during getValue - however I feel it would be a good safety measure for jooq to not match anything at all in this case, or ideally throw an exception.

I haven't checked the result using Jooq 3.6.2 but the changelog doesn't seem to list anything relevant to this situation.

Cheers,
Dave

Lukas Eder

unread,
Jul 18, 2015, 8:25:43 AM7/18/15
to jooq...@googlegroups.com
Hello,

Thanks for your message. However, I'm not 100% sure how to understand the code examples. What's a.name / b.name / Tables.B.name? Is "name" a placeholder for a column name that is present in both A and B?

Cheers,
Lukas

--
You received this message because you are subscribed to the Google Groups "jOOQ User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jooq-user+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

david....@gmail.com

unread,
Jul 18, 2015, 9:29:42 AM7/18/15
to jooq...@googlegroups.com
Yes - although in my case both tables had a column called 'name'.

The getValue call was iterating through the retrieved fields and returning the first thing called 'name'

Lukas Eder

unread,
Jul 20, 2015, 11:38:15 AM7/20/15
to jooq...@googlegroups.com
Oh, I see - thanks for the clarification, so I understood correctly.
The rationale behind the existing feature is to allow for "unknown" table aliases to be applied (e.g. when using derived tables) and to still use the generated column literals to extract values for columns of such tables. Example:

SELECT t.x, t.y
FROM (
    SELECT some_table.x, some_table.y
    FROM ...
) t

In the above example, you don't need to construct a formal t.x or t.y column expression to extract x or y values from the result. You can continue using SOME_TABLE.A generated literals. The algorithm works such that exact matches (table AND column names match) are preferred to "approximate" matches (only column names match).

As a side-effect, duplicate column names may produce unexpected behaviour as you have observed.

This issue has surfaced the user group on numerous occasions. It is actually not trivial to find the exact set of rules when exceptions should be thrown as ambiguous column names in top-level SELECTs are a tricky thing in SQL already. For instance, they're not allowed in nested selects... I'm very open to concrete suggestions for a set of rules, though

Cheers
Lukas

2015-07-18 15:29 GMT+02:00 <david....@gmail.com>:
Yes - although in my case both tables had a column called 'name'.

The getValue call was iterating through the retrieved fields and returning the first thing called 'name'
--
You received this message because you are subscribed to the Google Groups "jOOQ User Group" group.

david....@gmail.com

unread,
Jul 31, 2015, 10:28:57 AM7/31/15
to jOOQ User Group, lukas...@gmail.com
Ah, thanks for the explanation.

Without having looked much into the jooq source code, I guess one strategy would be when exact match has failed and falling back to approximate matching, throw in the case that the approximate match is ambiguous. A little bit of overhead to check for this case perhaps, however it would defend against subtle bugs such as the one I found in our code. Perhaps given that the overhead wouldn't exist if exact matching succeeded, it's justifiable?

Cheers,
Dave

Lukas Eder

unread,
Aug 5, 2015, 8:17:33 AM8/5/15
to david....@gmail.com, jOOQ User Group
I suspect that the amount of confusion that this feature has caused in the past really justifies going a step further. I've created an issue for this:

The interesting thing is really that there are still several levels of approximation. For instance:

SELECT t.x, u.x
FROM (
    SELECT some_table.x
    FROM ...
) t, (
    SELECT another_table.x
    FROM ...
) u

In the above case, when looking for the some_table.x reference in the result, intuitively there is no ambiguity. Specifically because might even be an auto-generated table alias, which the jOOQ API user doesn't explicitly define as such.

This will need some further careful consideration before implementation.

Cheers,
Lukas

Lukas Eder

unread,
Aug 15, 2015, 7:38:57 AM8/15/15
to David Hogan, jOOQ User Group
Good news. In the context of an entirely different issue, I've now fixed this partially as can be seen here:


The new algorithm is:

- Return exact matches (schema / table / colum names match)
- Return qualified approximate matches (table / column names match)
- Return approximate matches (column names match)

While I'm still not 100% sure if we should go on with https://github.com/jOOQ/jOOQ/issues/4455, the above is certainly reasonable from a backwards-compatibility perspective.

Cheers,
Lukas

david...@360globalnet.com

unread,
Aug 15, 2015, 9:04:10 AM8/15/15
to jOOQ User Group, david....@gmail.com
Hi Lukas,

It's a shame that making it throw an exception when there is no exact match and more than one column match would not be backward compatible - could this behaviour be enabled with an option somewhere? It's certainly what i'd expect the database itself do if asked an ambiguous question. Perhaps when initiating jooq we could specify a compatibility level, allowing people to move to the newer behavior when it makes sense for them?

Cheers,
Dave

Lukas Eder

unread,
Aug 15, 2015, 9:07:25 AM8/15/15
to jooq...@googlegroups.com, David Hogan
Hi David,

I didn't say that - I just said that #4455 could be implemented with certainty. #4471 needs more thinking (in terms of API consistency). Adding an option for this particular case is ... not an option. :) Options are hard to maintain. I'd rather re-think this thoroughly and move on.

Cheers,
Lukas

david...@360globalnet.com

unread,
Aug 15, 2015, 1:30:46 PM8/15/15
to jOOQ User Group, david....@gmail.com
Hi Lukas,

Yeah, I just assumed because there no doubt a bunch of code out there with an ambiguous query that currently works only because the first match is the one they wanted. Introducing an exception for that case would break that code. I can appreciate the aversion to options but from the outside it looks like a choice between keeping the current (sort of dangerous) behavior or breaking compatibility with existing users. Another option you might consider - logging a warning for ambiguous queries. Wouldn't break compatibility but would allow people to clean up their queries.

Thanks for all the great work on the library so far by way, It's brilliant.

Cheers,
Dave

Lukas Eder

unread,
Aug 18, 2015, 5:22:16 AM8/18/15
to jooq...@googlegroups.com, David Hogan
2015-08-15 19:30 GMT+02:00 <david...@360globalnet.com>:
Hi Lukas,

Yeah, I just assumed because there no doubt a bunch of code out there with an ambiguous query that currently works only because the first match is the one they wanted.

Yep. As always when an API tries to be clever :)
 
Introducing an exception for that case would break that code. I can appreciate the aversion to options but from the outside it looks like a choice between keeping the current (sort of dangerous) behavior or breaking compatibility with existing users. Another option you might consider - logging a warning for ambiguous queries. Wouldn't break compatibility but would allow people to clean up their queries.

That's a good idea for a gentle move towards the breaking change, which could then be implemented in a major release. I have created:

 
Thanks for all the great work on the library so far by way, It's brilliant

Thank you very much for your nice words!

Cheers,
Lukas 
Reply all
Reply to author
Forward
0 new messages