Unexpected behavior when matching field names using a character that also matches the table name?


Daniel Einspanjer

Jan 3, 2017, 11:10:36 AM
to jOOQ User Group
I have a DB with some tables that have unusual name casing.  In many cases, the names look great for using AS_IS such as "AddedTimestamp" or "HostName", but the schema designers decided that anything in the set of tables related to the object system should be named "objSystem*".  So there is a table named "objSystem" as well as "objSystemCategory", and in the "objSystem" table, there is a field called "objSystemCategory_Id".



I started with a field mapping like this:
<field>
<expression>^(.*)$</expression>
<fieldMember>
<transform>LOWER</transform>
</fieldMember>
<fieldSetter>
<transform>AS_IS</transform>
<expression>set$1</expression>
</fieldSetter>
<fieldGetter>
<transform>AS_IS</transform>
<expression>get$1</expression>
</fieldGetter>
</field>

It works well for all the easy cases, but it ended up with getters and setters like "getobjSystemCategory_Id".

I tried to fix that by adding a special case for fields that start with a lowercase "o":
<field>
<expression>^o(.*)$</expression>
<fieldMember>
<transform>LOWER</transform>
</fieldMember>
<fieldSetter>
<transform>AS_IS</transform>
<expression>setO$1</expression>
</fieldSetter>
<fieldGetter>
<transform>AS_IS</transform>
<expression>getO$1</expression>
</fieldGetter>
</field>

But something odd happened here.  Suddenly all my members and accessors had the prefix "objsystem_".  For example, "objsystem_HostName", "getObjsystem_HostName", and "setObjsystem_objSystemCategory_Id".

It seems that if the first part of the expression explicitly matches the table name, then the table name is part of the capture, but if the expression doesn't explicitly match, the table name is omitted?

I'm working around this behavior by capturing everything up to the first period and throwing it away, but it feels odd.  Is this working as designed?
<field>
<expression>^(?:[^.]+\.)o(.*)$</expression>
<fieldMember>
<transform>LOWER</transform>
<expression>o$1</expression>
</fieldMember>
<fieldSetter>
<transform>AS_IS</transform>
<expression>setO$1</expression>
</fieldSetter>
<fieldGetter>
<transform>AS_IS</transform>
<expression>getO$1</expression>
</fieldGetter>
</field>

-Daniel

Lukas Eder

Jan 3, 2017, 11:23:19 AM
to jooq...@googlegroups.com
Hi Daniel,

That's an interesting observation, thanks for sharing.

Indeed, all these expressions in the code generator configuration match either (in that order):

- The unqualified object name (org.jooq.util.Definition.getName())
- The fully qualified object name (org.jooq.util.Definition.getQualifiedName())

If the fully qualified object name is matched, then the match (and therefore any capture groups) will cover the entire qualified name. The solution here is to explicitly match the qualified name only:

   <expression>^.*?\.o([^.]*)$</expression>
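To make the two-step lookup described above concrete, here is a minimal sketch in plain Java. It is not jOOQ's actual implementation: the class `TwoStageMatch`, the method `apply`, and the use of full-match semantics via `Matcher.matches()` are assumptions for illustration, and the qualified name is assumed to be `objSystem.<field>` with no schema prefix. It only shows why a pattern that fails on the unqualified name can end up capturing the table name as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TwoStageMatch {

    // Hypothetical helper: try the unqualified name first, then the
    // qualified name, and apply the replacement to whichever one matched.
    static String apply(String expression, String replacement,
                        String name, String qualifiedName) {
        Pattern p = Pattern.compile(expression);

        Matcher m = p.matcher(name);
        if (m.matches())
            return m.replaceAll(replacement);

        m = p.matcher(qualifiedName);
        if (m.matches())
            return m.replaceAll(replacement);

        return null; // neither form matched
    }

    public static void main(String[] args) {
        // The catch-all pattern matches the unqualified name right away.
        System.out.println(apply("^(.*)$", "get$1",
            "HostName", "objSystem.HostName"));
        // prints "getHostName"

        // "^o(.*)$" fails on "HostName", then matches the qualified name,
        // so the capture group now contains the table name too.
        System.out.println(apply("^o(.*)$", "getO$1",
            "HostName", "objSystem.HostName"));
        // prints "getObjSystem.HostName"

        // A pattern that requires the separator can only match the
        // qualified form, and captures just the field name after the "o".
        System.out.println(apply("^.*?\\.o([^.]*)$", "getO$1",
            "objSystemCategory_Id", "objSystem.objSystemCategory_Id"));
        // prints "getObjSystemCategory_Id"
    }
}
```

The second result still contains the "." from the qualified name; the "objsystem_" prefixes reported earlier in the thread suggest the generator subsequently turns such a string into a valid Java identifier, but that step is not modelled here.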
I hope this helps,
Lukas



Daniel Einspanjer

Jan 3, 2017, 5:20:48 PM
to jOOQ User Group
Well, the confirmation that there are two distinct modes of matching involved is good, at least I didn't surface some weird bug. :)

Since the logic tries to match the unqualified object name first, I believe it can be a bit confusing for people in a similar scenario to mine where things are going along fine with a simple match such as .*, but then if they happen to write a new pattern that also happens to match the beginning of a table name, the mode suddenly shifts in a way that is likely to generate unexpected code.

I realize that making any significant changes to this logic would be hard when backward compatibility is considered, but I would suggest considering one of the following potential improvements/changes:

1. Add a note to the manual discussing the behavior ( https://github.com/jOOQ/jOOQ/pull/5767 )
2. Add an optional configuration setting that can be set to match only the field name, always the table.field, or the current behavior as default.
3. Log an info message when falling back (maybe this is here and I just didn't see it?)
4. Make a separate element for matching the qualified name such as <qualifiedNameExpression/>
5. Change the fallback test that uses the qualified name to only fire if the pattern has the qualifier separator in it.

Lukas Eder

Jan 5, 2017, 5:36:38 AM
to jooq...@googlegroups.com
Hi Daniel,

Thank you very much for your thorough analysis. I will comment inline.

2017-01-03 23:20 GMT+01:00 Daniel Einspanjer <daniel.e...@designet.com>:
Well, the confirmation that there are two distinct modes of matching involved is good, at least I didn't surface some weird bug. :)

Since the logic tries to match the unqualified object name first, I believe it can be a bit confusing for people in a similar scenario to mine where things are going along fine with a simple match such as .*, but then if they happen to write a new pattern that also happens to match the beginning of a table name, the mode suddenly shifts in a way that is likely to generate unexpected code.

Yes, I understand why this is confusing. For a while, these regexes happen to match only unqualified identifiers; then they suddenly start matching qualified identifiers as well, which is unexpected.

I realize that making any significant changes to this logic would be hard when backward compatibility is considered,

Yes, we cannot change the default behaviour easily.
 
but I would suggest considering one of the following potential improvements/changes:

1. Add a note to the manual discussing the behavior ( https://github.com/jOOQ/jOOQ/pull/5767 )

Agreed. Thanks for the PR. Will comment directly in there.
 
2. Add an optional configuration setting that can be set to match only the field name, always the table.field, or the current behavior as default.

I think that too many settings will quickly become too complicated to maintain, document, and integration-test. But there's a better suggestion below.
 
3. Log an info message when falling back (maybe this is here and I just didn't see it?)

That's probably quite verbose for those people who intentionally match only qualified names (i.e. what you call the "fall back")
 
4. Make a separate element for matching the qualified name such as <qualifiedNameExpression/>

That's the best solution - although, rather than specifying multiple expressions (what would be the semantics of specifying them both in the same configuration?), I would prefer a global flag that governs all expressions' behaviour uniformly:

1. UNQUALIFIED_THEN_QUALIFIED (current behaviour, default)
2. UNQUALIFIED
3. QUALIFIED

If this works out nicely, then we can move the default to QUALIFIED in the next major release - which is probably a better default.
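As a rough illustration of how such a flag could govern matching, here is a hypothetical sketch in plain Java. The three mode names come from the list above; the class `MatchModeSketch`, the enum `MatchMode`, and the method `matches` are invented for this example and are not part of any jOOQ release.

```java
import java.util.regex.Pattern;

public class MatchModeSketch {

    // Hypothetical enum mirroring the three proposed modes.
    enum MatchMode { UNQUALIFIED_THEN_QUALIFIED, UNQUALIFIED, QUALIFIED }

    // Decide whether an expression matches an object, given the mode.
    static boolean matches(MatchMode mode, String expression,
                           String name, String qualifiedName) {
        Pattern p = Pattern.compile(expression);
        switch (mode) {
            case UNQUALIFIED:
                return p.matcher(name).matches();
            case QUALIFIED:
                return p.matcher(qualifiedName).matches();
            default: // UNQUALIFIED_THEN_QUALIFIED: the behaviour discussed in this thread
                return p.matcher(name).matches()
                    || p.matcher(qualifiedName).matches();
        }
    }

    public static void main(String[] args) {
        String expr = "^o(.*)$";
        for (MatchMode mode : MatchMode.values())
            System.out.println(mode + ": "
                + matches(mode, expr, "HostName", "objSystem.HostName"));
    }
}
```

With UNQUALIFIED, the pattern from the original question would simply not match "HostName", so the surprising fallback to the qualified name could not occur.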

I've created a feature request for this:

5. Change the fallback test that uses the qualified name to only fire if the pattern has the qualifier separator in it.

That's an interesting idea, but it'll be hard to implement. The qualifier separator might just as well be part of an identifier itself. E.g., you could have a table like this:

- [Hello.World] (SQL Server quoted identifier)
- "Hello.World" (SQL standard quoted identifier)
- `Hello.World` (MySQL quoted identifier)
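A small Java sketch of that ambiguity, under the assumption that a table is literally named Hello.World and lives in a hypothetical schema called dbo: a pattern containing an escaped separator can be a perfectly valid match for the unqualified name, so the presence of `\.` in a pattern does not reveal whether the author meant the qualified name. The class and method names are made up for the example.

```java
import java.util.regex.Pattern;

public class SeparatorAmbiguity {

    // Full-match helper, as assumed throughout this thread's examples.
    static boolean matches(String expression, String input) {
        return Pattern.compile(expression).matcher(input).matches();
    }

    public static void main(String[] args) {
        String expr = "^Hello\\.World$";

        // The unqualified table name itself contains a dot.
        System.out.println(matches(expr, "Hello.World"));
        // prints "true"

        // The qualified name (hypothetical schema "dbo") does not match.
        System.out.println(matches(expr, "dbo.Hello.World"));
        // prints "false"
    }
}
```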

I think we'll have to live with the status quo for now with the two suggested improvements:

- Added documentation (immediately)
- New flag to explicitly change the behaviour (jOOQ 3.10+)

Thanks again for your valuable feedback!
Lukas