Wrong SQL joins handling

50 views
Skip to first unread message

mura acc

unread,
Mar 4, 2024, 2:38:47 PMMar 4
to antlr-discussion

Hello, 
I am interested in the grammars supporting SQL. However, I found out that grammar has issues with the join clause. The simple statement "select * from tab_a inner join tab_b" is interpreted in the way that inner is an alias.

 

It seems to me that the solution can be to define INNER_JOIN in the lexer. But this would lead to something like:
SPACES: [ \t\r\n]+ -> channel(HIDDEN);

SINGLE_LINE_COMMENT: '--' ~('\r' | '\n')* NEWLINE_EOF                 -> channel(HIDDEN);
MULTI_LINE_COMMENT:  '/*' .*? '*/'                                    -> channel(HIDDEN);

WASTE: (SPACES | SINGLE_LINE_COMMENT | MULTI_LINE_COMMENT)+;

fragment NEWLINE_EOF    : NEWLINE | EOF;
fragment QUESTION_MARK  : '?';
fragment SIMPLE_LETTER  : [A-Z];
fragment FLOAT_FRAGMENT : UNSIGNED_INTEGER* '.'? UNSIGNED_INTEGER+;
fragment NEWLINE        : '\r'? '\n';
fragment SPACE          : [ \t];


INNER:                         'INNER';
JOIN:                         'JOIN';
INNER_JOIN: INNER (WASTE)? JOIN;

I don't know if this lexer rule is the proper way to define something like this. To be honest, I wonder how multi-token commands are parsed in general.

Thank you.


Best regards 

Mura.

William Fahle

unread,
Mar 6, 2024, 3:43:44 PMMar 6
to antlr-discussion
I believe an INNER JOIN has to be ON something. select * from tab_a inner join tab_b on column_1 works fine.

mura acc

unread,
Mar 7, 2024, 10:38:51 AMMar 7
to antlr-di...@googlegroups.com
Hello William, 
Thank you for your reply. It is not correct. I changed your example to a valid SQL: select * from tab_a inner join tab_b on column_1 = c2

This is the output:
(select_statement (select_only_statement (subquery (subquery_basic_elements (query_block select (selected_list *) (from_clause from (table_ref_list (table_ref (table_ref_aux (table_ref_aux_internal (dml_table_expression_clause (tableview_name (identifier (id_expression (regular_id tab_a)))))) (table_alias (identifier (id_expression (regular_id (non_reserved_keywords_pre12c inner)))))) (join_clause join (table_ref_aux (table_ref_aux_internal (dml_table_expression_clause (tableview_name (identifier (id_expression (regular_id tab_b))))))) (join_on_part on (condition (expression (logical_expression (unary_logical_expression (multiset_expression (relational_expression (relational_expression (compound_expression (concatenation (model_expression (unary_expression (atom (general_element (general_element_part (id_expression (regular_id column_1)))))))))) (relational_operator =) (relational_expression (compound_expression (concatenation (model_expression (unary_expression (atom (general_element (general_element_part (id_expression (regular_id c2))))))))))))))))))))))))))

The wrong part is: (table_alias (identifier (id_expression (regular_id (non_reserved_keywords_pre12c inner))))))

The INNER is parsed as a table alias, which is wrong. The same is true for the left join, LEFT, etc.

BTW, my "solution" is wrong; it won't work. Because this is valid SQL, too: select * from inner join tab_a on inner.id = tab_a.id; in this case, INNER is a table name in Oracle. INNER is not a reserved key. 

Thank you,

Mura.






st 6. 3. 2024 v 21:43 odesílatel William Fahle <bill...@gmail.com> napsal:
--
You received this message because you are subscribed to the Google Groups "antlr-discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to antlr-discussi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/antlr-discussion/451ed906-14c9-4db1-8823-0b834f4d30f6n%40googlegroups.com.

Jeffrey Coffield

unread,
Mar 7, 2024, 11:38:35 AMMar 7
to antlr-di...@googlegroups.com
Hello,

In my experience "SQL" is one of the most non-standard "standards" I've seen, which is why I use JPA for all our projects. The persistence layer takes care of most of the differences in SQL dialects by converting JPQL to SQL based on what particular database the app connects to. This has (mostly) allowed changing databases without having to recode the queries.

For me, having a JPQL grammar would be more useful than a SQL grammar as any SQL grammar would pretty much have to be specific to a particular database. 

I realize that this is just my point of view but I would offer it as something to consider.

Good luck in your efforts,
Jeff.


Message has been deleted

Sébastien Frippiat

unread,
Apr 22, 2024, 9:01:00 AMApr 22
to antlr-discussion
Hello.

I just found out about this bug too. It seems to have been there for a long time without a fix: https://github.com/antlr/grammars-v4/issues/1726.

I used a quick hack in my code to bypass it, I'll see if it passed all my tests when I'll have finished integrating them in my project...

            // NOTE: this is a hack to handle implicit joins that are not recognized correctly by the parser
            //       see https://github.com/antlr/grammars-v4/issues/1726
            //           https://groups.google.com/g/antlr-discussion/c/aqiCWYX1AiM?pli=1
            if (sqlElement.Table.Alias != null &&
                sqlElement.Joins.Count > 0 &&
                sqlElement.Joins[0].JoinType == SqlJoinType.Implicit)
            {
                if (sqlElement.Table.Alias.ToSql().ToLower() == "[inner]")
                {
                    sqlElement.Table.Alias = null;
                    sqlElement.Joins[0].JoinType = SqlJoinType.Inner;
                }
                else if (sqlElement.Table.Alias.ToSql().ToLower() == "[left]")
                {
                    sqlElement.Table.Alias = null;
                    sqlElement.Joins[0].JoinType = SqlJoinType.ImplicitLeft;
                }
                else if (sqlElement.Table.Alias.ToSql().ToLower() == "[right]")
                {
                    sqlElement.Table.Alias = null;
                    sqlElement.Joins[0].JoinType = SqlJoinType.ImplicitRight;
Reply all
Reply to author
Forward
0 new messages